LogMSE predictions; last layer adjustment?

Hi all :),

I have a fairly general question, and I couldn’t find anything online that deals with my issue:

I train an RCNN for a regression task (no activation in the last layer). The distribution of my target (measured on a scale between 0 and 10000) is heavily zero-inflated and also contains some very large values. As a result, using MSE as the training loss often produces NaNs during training (which can be tamed with gradient clipping, of course), but a quick fix that has worked nicely is to use the logMSE as the training loss instead.

However, once I use model.predict(), the output of the model appears to be on the log scale as well. Of course, I can take the exp() of the predictions, but I wonder if I am on the right track here. In addition, this approach produces a few NaNs.
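To make the setup concrete, here is a minimal sketch in plain Python of what I understand this to mean. It assumes that "logMSE" is the MSE between log1p(target) and a model output that already lives on the log1p scale; that is an assumption and may not match what the framework's loss does internally. The names log_mse and to_original_scale are hypothetical helpers, not library functions. I use log1p/expm1 rather than log/exp here because log(0) is undefined and the target contains many zeros.

```python
import math

def log_mse(y_true, y_pred_log):
    """MSE between log1p(y_true) and predictions assumed to be on the log1p scale."""
    errors = [(math.log1p(t) - p) ** 2 for t, p in zip(y_true, y_pred_log)]
    return sum(errors) / len(errors)

def to_original_scale(y_pred_log):
    """Invert log1p with expm1 so that a log-scale prediction of 0.0 maps back to 0."""
    return [math.expm1(p) for p in y_pred_log]

# Toy zero-inflated target with one very large value (made-up numbers).
y_true = [0.0, 0.0, 12.0, 10000.0]

# Pretend the model predicts the log1p-transformed target perfectly.
y_pred_log = [math.log1p(t) for t in y_true]

loss = log_mse(y_true, y_pred_log)
y_pred = to_original_scale(y_pred_log)
```

In this sketch the back-transform has to be the exact inverse of the transform applied to the target: expm1 pairs with log1p, while exp would only pair with a plain log.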

So my two questions are:

  1. Is it correct that model.predict() of a model trained with logMSE yields predictions on the log scale?
  2. What is the correct way to obtain predictions in the original scale?

Thank you!
Best
NW