LogMSE predictions; last layer adjustment?

Hi all :),

I have a somewhat general question and couldn’t find anything online dealing with my issue:

I am training an RCNN for a regression task (no activation in the last layer). My target (measured on a scale from 0 to 10000) contains some very large values and is also heavily zero-inflated. When using MSE as the training loss, this often produces NaNs during training (which can of course be tamed with gradient clipping), but a quick fix that proved to work nicely is to use logMSE as the training loss instead.

However, once I use model.predict(), the output of the model appears to be in logs as well. Of course, I can take the exp() of the predictions, but I wonder whether I am on the right track here. In addition, this approach produces a few NaNs.
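To make my setup concrete, here is a stripped-down sketch of what I mean (a toy dense model and synthetic data instead of my actual RCNN, and I am assuming here that "logMSE" amounts to fitting plain MSE against log1p-transformed targets):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the RCNN: a small regression model with no activation
# in the last layer, as in the real setup.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),  # linear output, no activation
])

# Synthetic, heavily zero-inflated targets on a 0..10000 scale.
x = np.random.rand(1024, 8).astype("float32")
y = np.random.choice([0.0, 0.0, 0.0, 500.0, 10000.0], size=(1024, 1)).astype("float32")

# "logMSE" in this reading: ordinary MSE, but fitted against log1p(y).
model.compile(optimizer="adam", loss="mse")
model.fit(x, np.log1p(y), epochs=2, verbose=0)

# Because the model was fitted to log-scale targets, predict() returns
# log-scale values; np.expm1 maps them back to the original scale.
pred_log = model.predict(x)
pred_original = np.expm1(pred_log)
```

(I use log1p/expm1 here rather than plain log/exp so the many exact zeros in the target do not blow up to -inf.)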

So my two questions are:

  1. Is it correct that model.predict() of a model trained with logMSE yields log-scale values itself?
  2. What is the correct way to obtain predictions on the original scale?

Thank you!
Best
NW

Hi @nmwitzig

Welcome to the TensorFlow Forum!

Model.predict() runs a forward pass and returns the raw outputs of the model's last layer. For a classification model these are the probabilities of the associated classes (labels), and the label with the highest confidence value is taken as the predicted class; for a regression model like yours, the outputs are on whatever scale the model was trained to produce. Please refer to this prediction doc to understand the prediction functionality.
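As a side note for the regression case: if the log is applied inside the loss rather than to the targets, for example with the built-in MeanSquaredLogarithmicError loss, the model is fitted against the original-scale targets, and predict() then already returns values on that scale with no exp() needed. A rough sketch of that variant (toy model and synthetic data, not your actual setup):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),  # no activation in the last layer
])

# MSLE compares log1p(y_true) with log1p(y_pred) inside the loss,
# so the targets are passed in on their original 0..10000 scale.
model.compile(optimizer="adam",
              loss=tf.keras.losses.MeanSquaredLogarithmicError())

x = np.random.rand(256, 8).astype("float32")
y = np.random.choice([0.0, 0.0, 2000.0, 10000.0], size=(256, 1)).astype("float32")
model.fit(x, y, epochs=2, verbose=0)

# predict() returns the raw last-layer outputs, which here are already
# on the original scale.
preds = model.predict(x)
```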

If the issue still persists, could you please share minimal reproducible code so we can replicate and understand it better? Thank you