Binary masked loss based on episode input data

Hello. I have several scenarios with the same time series forecasting problem. I have time-driven episodes with ncols columns of BINARY values (0/1), where a fixed number of columns (nones) is set to 1 in every episode. In the simplest scenario I have episodes with ncols=49 and nones=6; in the complex scenario I have data with ncols=512 and nones=16.
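For concreteness, the data looks roughly like this synthetic stand-in (a minimal sketch; the variable names are mine, not from my actual files):

```python
import numpy as np

# Synthetic stand-in for the episode data: every row has exactly
# `nones` ones out of `ncols` binary columns.
rng = np.random.default_rng(0)
ncols, nones, n_episodes = 49, 6, 1000

episodes = np.zeros((n_episodes, ncols), dtype=np.float32)
for row in episodes:
    row[rng.choice(ncols, size=nones, replace=False)] = 1.0

print(episodes.shape)            # (1000, 49)
print(episodes.sum(axis=1)[:5])  # every row sums to 6.0
```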

I have to forecast the next episode, i.e. ncols outputs, but ONLY the columns predicted as 1 are relevant.

I tested several LSTM models with different optimizers, losses (even with extreme settings) and metrics, but ALL of them end up with the same binary accuracy of (ncols - nones) / ncols. In the case of ncols=49 and nones=6, the binary accuracy is 43/49 ≈ 0.8776, which means the network ends up predicting ZERO for ALL output neurons. I have also run many tests treating the input data as numbers (instead of True/False), without any improvement in accuracy.
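For reference, the models I tried are roughly of this shape (a simplified sketch; the layer size and window length are placeholders, not my exact configuration):

```python
import tensorflow as tf
from tensorflow import keras

ncols = 49
window = 10  # number of past episodes fed to the LSTM (placeholder value)

model = keras.Sequential([
    keras.Input(shape=(window, ncols)),       # sequence of past episodes
    keras.layers.LSTM(128),
    keras.layers.Dense(ncols, activation="sigmoid"),  # one probability per column
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])
# With 6 ones vs 43 zeros in every target row, training collapses to
# all-zero predictions and binary_accuracy plateaus at 43/49 ≈ 0.8776.
```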

I suspect the solution is to implement a custom loss function in which each column of each episode is weighted according to its true value: when the real value of the neuron is 0, its contribution is scaled by a zero_factor, and when the real value is 1, by a one_factor. However, my knowledge of Python is very limited and I have not been able to find a suitable loss function on the Internet. Something like the sketch below is what I have in mind.
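This is only a sketch of the idea, assuming TensorFlow/Keras; `one_factor` and `zero_factor` are the per-class weights I described above, and their values here are just guesses:

```python
import tensorflow as tf

def weighted_binary_crossentropy(one_factor=8.0, zero_factor=1.0):
    """Binary cross-entropy where mistakes on true 1s and true 0s are
    weighted differently; the factor values are placeholders to tune."""
    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, y_pred.dtype)
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)  # avoid log(0)
        # Per-element cross-entropy, scaled by the class of the true value.
        ce = -(one_factor * y_true * tf.math.log(y_pred)
               + zero_factor * (1.0 - y_true) * tf.math.log(1.0 - y_pred))
        return tf.reduce_mean(ce, axis=-1)  # average over the ncols outputs
    return loss

# Hypothetical usage:
# model.compile(optimizer="adam",
#               loss=weighted_binary_crossentropy(one_factor=8.0, zero_factor=1.0))
```

Is this the right approach, and if so, how should the two factors be chosen?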

I would really appreciate any kind of help, and I will provide data and source code if requested.