Hi! Can I put more load on the CPU? By default I see ~25% usage on a multicore machine with "@tensorflow/tfjs-node": "^3.12.0".
Maybe @Matthew_Soulanille knows the answer to this one?
Hi @gotostereo. Are you seeing 25% of your threads in use at 100%, or 25% usage on all of your threads?
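One way to tell those two cases apart is to sample per-core usage from Node itself while training runs. The snippet below is only a rough sketch using the built-in `os` module; the helper names (`snapshot`, `perCoreUsage`, `sampleUsage`) are made up for illustration and are not part of tfjs-node.

```javascript
const os = require('os');

// Take a snapshot of cumulative busy/total times (in ms) for each logical core.
function snapshot() {
  return os.cpus().map((cpu) => {
    const t = cpu.times;
    const total = t.user + t.nice + t.sys + t.idle + t.irq;
    return { busy: total - t.idle, total };
  });
}

// Percentage of time each core was busy between two snapshots.
function perCoreUsage(before, after) {
  return after.map((a, i) => {
    const dTotal = a.total - before[i].total;
    const dBusy = a.busy - before[i].busy;
    return dTotal > 0 ? (100 * dBusy) / dTotal : 0;
  });
}

// Resolve with per-core usage measured over intervalMs milliseconds.
function sampleUsage(intervalMs = 1000) {
  const before = snapshot();
  return new Promise((resolve) => {
    setTimeout(() => resolve(perCoreUsage(before, snapshot())), intervalMs);
  });
}

// Example (hypothetical usage while model.fit() is running elsewhere):
// sampleUsage(2000).then((usage) =>
//   console.log(usage.map((u) => u.toFixed(0) + '%').join(' ')));
```

If a few cores show close to 100% and the rest are near idle, only some threads are being used. If every core hovers around the same low number, the work is spread out but something is stalling all of them.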
I initially thought this might be caused by a disk bottleneck in your code, such as when reading training data. However, I just tried our mnist-node example, which stores all training data in memory, and saw only 50% CPU usage on each thread (on a 16-thread machine). This might be a bug in tfjs-node, or perhaps there’s an inefficiency in the fitLoop function (which trains the model) when it’s using the node backend.
I also tried the same example with tfjs-node-gpu and saw far less speedup than expected (only ~1.5x) and only about 25% GPU utilization. I saw two node processes: one at 20% CPU, which I think is feeding the GPU, and the main one at 120%. This indicates to me that there’s a bottleneck in how we’re distributing work to the GPU (or threads in your case), but I’ll have to look into it more to be sure.
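For context on per-process numbers like the 120% above: a value over 100% means the process used more than one core's worth of CPU time during the measurement window. A minimal sketch of how to get that figure from inside Node is below, using the built-in `process.cpuUsage()` and `process.hrtime.bigint()`. The function names are hypothetical, and I'm assuming the reported CPU time covers all threads in the process, native ones included.

```javascript
// CPU time used divided by wall-clock time, as a percentage.
// cpuDelta has the shape returned by process.cpuUsage(): microseconds of
// user and system CPU time. Values above 100 mean more than one core was busy.
function processCpuPercent(cpuDelta, wallMicros) {
  if (wallMicros <= 0) return 0;
  return (100 * (cpuDelta.user + cpuDelta.system)) / wallMicros;
}

// Run an async piece of work and report this process's CPU utilization for it.
async function measure(work) {
  const cpuStart = process.cpuUsage();
  const wallStart = process.hrtime.bigint();
  await work();
  const cpuDelta = process.cpuUsage(cpuStart); // delta since cpuStart
  const wallMicros = Number(process.hrtime.bigint() - wallStart) / 1000; // ns -> us
  return processCpuPercent(cpuDelta, wallMicros);
}

// Example (hypothetical): wrap a training call.
// measure(() => model.fit(xs, ys, { epochs: 1 }))
//   .then((pct) => console.log('process CPU: ' + pct.toFixed(0) + '%'));
```

On a 16-thread machine, full utilization from a single process would show up as roughly 1600% by this measure, so 120% is a small fraction of what is available.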
Looking at performance profiles, most of the time is spent on NodeJSKernelBackend.getInputTensorIds. I’m guessing there’s too much overhead on these functions, and that’s what’s slowing down the rest of the threads. I’ll see if anyone else on the team has thoughts on this.
Windows 10 x64, latest Node.js, Intel 3770K. All threads run at ~25%. It would be great if this could be controlled.
This is less of a problem for inference, where using a graph model can more evenly distribute the work, but tfjs does not yet support training graph models.