Hi all :),
I have a somewhat general question and couldn't find anything online that deals with my issue:
I am training an RCNN for a regression task (no activation in the last layer). The distribution of my target (measured on a scale from 0 to 10000) is heavily zero-inflated and also contains some very large values. When I use MSE as the training loss, this often produces NaNs during training (which can be tamed with gradient clipping, of course), but a quick fix that has worked nicely is to use logMSE as the training loss instead.
However, when I call model.predict(), the output of the model appears to be on the log scale as well. Of course, I can take exp() of the predictions, but I wonder whether I am on the right track here. In addition, this approach produces a few NaNs.
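To make the setup concrete, here is a minimal NumPy sketch of a log-space MSE and the matching back-transform. It is only an illustration under stated assumptions, not the actual training code: it assumes the loss compares the raw network output to log1p(y) = log(1 + y), so that zero targets stay finite (a plain log(0) would give -inf). The names log_mse and to_original_scale and the toy targets are hypothetical.

```python
import numpy as np


def log_mse(y_true, y_pred_log):
    """MSE in log space: the raw model output is compared to log1p(y_true)."""
    return float(np.mean((np.log1p(y_true) - y_pred_log) ** 2))


def to_original_scale(y_pred_log):
    """Invert log1p with expm1.

    Assumption: the target is non-negative, so log-space predictions
    below 0 are clipped to 0 before inverting.
    """
    return np.expm1(np.maximum(y_pred_log, 0.0))


# Made-up, zero-inflated targets on a 0..10000 scale.
y_true = np.array([0.0, 0.0, 3.0, 250.0, 10000.0])

# Stand-in for model.predict() of a model that fits the log targets perfectly.
y_pred_log = np.log1p(y_true)

print(log_mse(y_true, y_pred_log))
print(to_original_scale(y_pred_log))
```

If the loss is defined this way, the network learns log1p(y) and the inverse is expm1 rather than exp. Whether that matches a particular framework's built-in loss is something to check against its documentation.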
So my two questions are:
1. Is it correct that model.predict() of a model trained with logMSE yields predictions on the log scale?
2. What is the correct way to obtain predictions on the original scale?