Pass at least one tensor to concat issue

I'm making a Discord bot that I'm training to detect scam messages. It works; the only issue I get is the error below:

Error: Pass at least one tensor to concat
    at assert (/home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:454:15)
    at concat_ (/home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:9999:5)
    at Object.concat__op (/home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:5612:29)
    at /home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-layers/dist/tf-layers.node.js:24370:96
    at Array.map (<anonymous>)
    at /home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-layers/dist/tf-layers.node.js:24370:49
    at /home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4562:22
    at Engine.scopedRun (/home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4572:23)
    at Engine.tidy (/home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4561:21)
    at Object.tidy (/home/container/node_modules/@tensorflow/tfjs/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:6315:19)

This is all the TensorFlow code I have, to give an idea of my setup. The error doesn't always happen; it only occurs some of the time.

const tensorflow = require('@tensorflow/tfjs-node');
const universalsentenceencoder = require('@tensorflow-models/universal-sentence-encoder');
const WhiskerDataset = [
    {
        message: "Im a scam message",
        scam: true
    },
    {
        message: "Hi greg how are you?",
        scam: false
    }
];

(async () => {
    let useModelPromise = universalsentenceencoder.load();
    const outputData = tensorflow.tensor2d(WhiskerDataset.map(item => [
        item.scam === true ? 1 : 0,
    ]));
    const model = tensorflow.sequential();
    model.add(tensorflow.layers.dense({
        inputShape: [512],
        activation: 'sigmoid',
        units: 1,
    }));
    model.add(tensorflow.layers.dense({
        activation: 'sigmoid',
        units: 1,
    }));
    model.add(tensorflow.layers.dense({
        activation: 'sigmoid',
        units: 1,
    }));
    model.compile({
        loss: 'meanSquaredError',
        optimizer: tensorflow.train.adam(0.06),
    });
    const batchSize = 100;
    const encodedData = [];
    const useModel = await useModelPromise;

    for (let i = 0; i < WhiskerDataset.length; i += batchSize) {
        const batchSentences = WhiskerDataset.slice(i, i + batchSize).map(item => item.message.toLowerCase());
        const batchEncoded = await useModel.embed(batchSentences);
        if (batchEncoded && batchEncoded.size > 0) {
            console.log("Batch encoded:", batchEncoded);
            encodedData.push(batchEncoded);
        } else {
            console.log("Skipping undefined or empty tensor at index:", i);
        }
    }

    const newMessage = "Im another message here to scam"
    const trainingData = encodedData[0]; // was previously: const trainingData = await tensorflow.concat(encodedData); kept it here while trying other ways to get it to work
    await model.fit(trainingData, outputData, { epochs: 100 });
    const EmbededMessage = await useModel.embed([newMessage]);
    const predictions = model.predict(EmbededMessage).arraySync();
    console.log(predictions[0][0])
})()
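For context on the error itself: tf.concat throws this exact assertion when it is handed an empty array, so I'm wondering if encodedData can end up empty when every batch gets skipped. A rough guard sketch, using the same variables as above:

if (encodedData.length === 0) {
    // tensorflow.concat([]) throws "Pass at least one tensor to concat"
    throw new Error('No embeddings were produced, nothing to train on');
}
const trainingData = tensorflow.concat(encodedData);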

Can you provide more details on when it fails?

E.g. is it a specific string that causes it to fail?

Is the string empty or some edge case like that?

Can you check Chrome dev tools to see if there is a memory leak (in case something is failing somewhere due to tensors not being cleaned up)? You can print out how many tensors are in memory after each inference by logging:

tf.memory().numTensors

Check that this is not increasing each time; it should stay constant if you are disposing of everything correctly.
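Something like this around one inference would show it (a rough sketch; runInference() is a stand-in for whatever your bot does per message):

const before = tf.memory().numTensors;
await runInference(message); // hypothetical: your embed + predict path
const after = tf.memory().numTensors;
console.log(`tensors: ${before} -> ${after}`); // 'after' should not keep growing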

The string being used is not empty, but I do replace newlines in the string with a literal \n. Sometimes it gets this error on startup when someone messages, and sometimes it seems to error on certain messages. I don't have the exact message that caused the error, but I'll let you know when I can. It also does seem to be a memory leak: I debugged it with what you sent, and the count kept rising instead of staying constant.


I would narrow those two things down first, to see if it is some odd edge case that is causing issues.

P.S. you may find my USE CodePen example useful for how I was using it:

I have not experienced any issues there. One thing I notice right away is that you do not await the USE load() at the start of your async function, whereas in mine I call use.load().then(…).

There may be some odd edge case whereby things are not ready yet in some situations, due to the network conditions of downloading the model, versus the training of the custom model you also create.
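Something like this ordering would rule that out (a rough sketch; client is a stand-in for your discord.js client):

// nothing can call embed() until the USE model has fully downloaded
universalsentenceencoder.load().then((useModel) => {
    client.on('messageCreate', async (message) => {
        const embedding = await useModel.embed([message.content.toLowerCase()]);
        // ... run your trained model's predict() here ...
        embedding.dispose();
    });
});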

This is the output that was given, and as you can see it's still incrementing even after disposing afterwards. I've also found that removing the newline-strip code still causes it to do so.

Prediction: 34.97%
Tensor Memory: 86
Tensor Memory: 85
Prediction: 34.97%
Tensor Memory: 87
Tensor Memory: 86

Here is the code I'm using. I'm not running it in the browser though; it's on a VPS with Node.js.

const tensorflow = require('@tensorflow/tfjs-node');
const universalsentenceencoder = require('@tensorflow-models/universal-sentence-encoder');
const WhiskerAiDataset = [
    {
        "message": "whats crazy was that was my entry to a halloween contest for a roblox jojo game, and i won first",
        "scam": false
    },
    {
        "message": "polar bears",
        "scam": false
    },
    {
        "message": "I am a malicous scam message muahahahahahaha",
        "scam": true
    },
    {
        "message": "I am another malicous scam message but even more sneakier muahahahahahaha",
        "scam": true
    }
];

(async () => {
    const useModel = await universalsentenceencoder.load();
    const outputData = tensorflow.tensor2d(WhiskerAiDataset.map(item => [
        item.scam === true ? 1 : 0,
    ]));
    const model = tensorflow.sequential();
    model.add(tensorflow.layers.dense({
        inputShape: [512],
        activation: 'sigmoid',
        units: 1,
    }));
    model.add(tensorflow.layers.dense({
        activation: 'sigmoid',
        units: 1,
    }));
    model.add(tensorflow.layers.dense({
        activation: 'sigmoid',
        units: 1,
    }));
    model.compile({
        loss: 'meanSquaredError',
        optimizer: tensorflow.train.adam(0.06),
    });

    const encodedData = []
    const MessagesList = WhiskerAiDataset.map(item => item.message.toLowerCase());
    const batchEncoded = await useModel.embed(MessagesList);
    encodedData.push(batchEncoded);

    const TestMessage = "Hi there fellow friend!"
    const NewLineStripped = TestMessage.replace(/\n/g, '\\n').toLowerCase()
    const concat = await tensorflow.concat(encodedData);
    await model.fit(concat, outputData, { epochs: 100 });
    const EmbededMessage = await useModel.embed([NewLineStripped]);
    const predictions = model.predict(EmbededMessage).arraySync();
    console.log(`Prediction: ${(predictions[0][0] * 100).toFixed(2)}%`)
    console.log(`Tensor Memory: ${tensorflow.memory().numTensors}`)
    await EmbededMessage.dispose()
    console.log(`Tensor Memory: ${tensorflow.memory().numTensors}`)

    const EmbededMessage1 = await useModel.embed([NewLineStripped]);
    const predictions1 = model.predict(EmbededMessage1).arraySync();
    console.log(`Prediction: ${(predictions1[0][0] * 100).toFixed(2)}%`)
    console.log(`Tensor Memory: ${tensorflow.memory().numTensors}`)
    await EmbededMessage1.dispose()
    console.log(`Tensor Memory: ${tensorflow.memory().numTensors}`)
})()

I believe model.predict() returns a tensor that I do not see you disposing of anywhere. You can potentially wrap your prediction code in tf.tidy() if you are not doing any async work inside it. You also don't need to await dispose(); it's synchronous.
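For example, a rough sketch using the variable names from your snippet:

// embed() is async, so it stays outside tidy; every tensor created
// inside the tidy callback is disposed automatically on the way out
const EmbededMessage = await useModel.embed([NewLineStripped]);
const score = tensorflow.tidy(() => {
    const predictions = model.predict(EmbededMessage);
    return predictions.arraySync()[0][0]; // a plain number, safe to return
});
EmbededMessage.dispose(); // tidy does not dispose tensors created outside it
console.log(`Prediction: ${(score * 100).toFixed(2)}%`);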

Hey, it seems like disposing of the prediction worked. I didn't think that needed to be disposed of, since it wasn't a part of the "useModel" variable. Here is a working version, so if anybody runs into this error like me, they can get an idea of the fix. Also, thank you @Jason for the help.

const tensorflow = require('@tensorflow/tfjs-node');
const universalsentenceencoder = require('@tensorflow-models/universal-sentence-encoder');
const WhiskerAiDataset = [
    {
        "message": "whats crazy was that was my entry to a halloween contest for a roblox jojo game, and i won first",
        "scam": false
    },
    {
        "message": "polar bears",
        "scam": false
    },
    {
        "message": "I am a malicous scam message muahahahahahaha",
        "scam": true
    },
    {
        "message": "I am another malicous scam message but even more sneakier muahahahahahaha",
        "scam": true
    }
];

(async () => {
    const useModel = await universalsentenceencoder.load();
    const outputData = tensorflow.tensor2d(WhiskerAiDataset.map(item => [
        item.scam === true ? 1 : 0,
    ]));
    const model = tensorflow.sequential();
    model.add(tensorflow.layers.dense({
        inputShape: [512],
        activation: 'sigmoid',
        units: 1,
    }));
    model.add(tensorflow.layers.dense({
        activation: 'sigmoid',
        units: 1,
    }));
    model.add(tensorflow.layers.dense({
        activation: 'sigmoid',
        units: 1,
    }));
    model.compile({
        loss: 'meanSquaredError',
        optimizer: tensorflow.train.adam(0.06),
    });

    const encodedData = []
    const MessagesList = WhiskerAiDataset.map(item => item.message.toLowerCase());
    const batchEncoded = await useModel.embed(MessagesList);
    encodedData.push(batchEncoded);
    const TestMessage = "Hi there fellow friend!"
    const NewLineStripped = TestMessage.replace(/\n/g, '\\n').toLowerCase()
    const concat = await tensorflow.concat(encodedData);
    await model.fit(concat, outputData, { epochs: 100 });

    const EmbededMessage = await useModel.embed([NewLineStripped]);
    const predictions = model.predict(EmbededMessage);
    console.log(`Prediction: ${(predictions.arraySync()[0][0] * 100).toFixed(2)}%`)
    console.log(`Before Tensor Memory: ${tensorflow.memory().numTensors}`)
    EmbededMessage.dispose()
    predictions.dispose() // the tensor returned by model.predict() must be disposed too
    console.log(`After Tensor Memory: ${tensorflow.memory().numTensors}`)

    const EmbededMessage1 = await useModel.embed([NewLineStripped]);
    const predictions1 = model.predict(EmbededMessage1);
    console.log(`Prediction: ${(predictions1.arraySync()[0][0] * 100).toFixed(2)}%`)
    console.log(`Before Tensor Memory: ${tensorflow.memory().numTensors}`)
    EmbededMessage1.dispose()
    predictions1.dispose()
    console.log(`After Tensor Memory: ${tensorflow.memory().numTensors}`)
})()

Awesome, glad it was useful!