Supplying custom benchmark tensor to loss/metric functions

SnakeIsTheName · December 2, 2021, 8:16pm

Certain loss/metric functions like UMBRAE and MASE make use of a benchmark, typically the “naïve forecast”, which is a one-period lag of the target.

However, in my dataset, I’m using hourly data to train/predict monthly returns. So my naïve forecast isn’t 1 row behind; it’s N rows behind, where N can change over time, especially with monthly timeframes (some months are shorter or longer than others).

I already have a feature called bars_in_X, where X is one of D, W, M, Y for the respective timeframe (though for the sake of argument, I’m only using M). It’s an integer giving the number of rows back to the one-period-ago row for that timeframe. So for bars_in_D, that would typically be 24 (as there are 24 hours in 1 day), i.e., the naïve forecast for the hourly value NOW happened 24 bars ago.

As a halfway measure, I find the mean of each of those features in the dataset and, before creating the model, I make custom loss functions that are supplied this value (see how here). This produces a usable but technically incorrect result, because it’s a static backreference rather than the dynamic bars_in_X value. It would also be insufficient when I eventually want to find the naïve forecast for ALL timeframes (not just one).
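To make the difference between a static and a dynamic backreference concrete, here is a minimal framework-free sketch in plain Python. The function names naive_forecast and mase_dynamic are hypothetical, and the scaling used (benchmark error measured on the same rows as the model error) is an assumption for illustration; the textbook MASE scales by the in-sample naïve error instead.

```python
def naive_forecast(y, lags):
    """For each row t, return the value lags[t] rows earlier, or None if unavailable."""
    out = []
    for t, lag in enumerate(lags):
        src = t - lag
        out.append(y[src] if lag > 0 and src >= 0 else None)
    return out


def mase_dynamic(y_true, y_pred, lags):
    """Model MAE divided by benchmark MAE, using a per-row lag (hypothetical helper)."""
    bench = naive_forecast(y_true, lags)
    # Only score rows that actually have a benchmark value to compare against.
    rows = [t for t, b in enumerate(bench) if b is not None]
    model_err = sum(abs(y_true[t] - y_pred[t]) for t in rows) / len(rows)
    naive_err = sum(abs(y_true[t] - bench[t]) for t in rows) / len(rows)
    return model_err / naive_err  # undefined if the benchmark is perfect


# Toy data: the lag column plays the role of bars_in_M and varies per row.
y_true = [1.0, 2.0, 4.0, 7.0, 11.0]
y_pred = [1.0, 2.5, 3.0, 6.0, 12.0]
lags = [1, 1, 2, 2, 3]

bench = naive_forecast(y_true, lags)
score = mase_dynamic(y_true, y_pred, lags)
```

A score below 1 means the model beats the dynamic naïve forecast on the scored rows. Replacing lags with a constant (the mean approach described above) changes which row each benchmark value comes from, which is the source of the error.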

Does anyone have a suggested method of handling this kind of situation?

markdaoust · December 3, 2021, 8:02pm

My first guess is that your “loss function” should be an instance of a class that has a built-in circular memory buffer implemented in a tf.Variable.

Make the buffer large enough that you always have the record you need to go back to look at.

If you’re using Keras, you’ll need to override train_step so you can thread the bars_in_X feature through to the loss function.
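The circular buffer idea can be sketched independently of TensorFlow. The RingBuffer class below is hypothetical and uses a plain Python list for storage; in the setup suggested above, the storage and the write counter would presumably live in tf.Variable objects, and a custom train_step would push each batch of targets and read back using the bars_in_X value.

```python
class RingBuffer:
    """Fixed-size history of the most recent values (illustrative, not a TF API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [0.0] * capacity
        self.count = 0  # total number of values written so far

    def push(self, values):
        """Append values, overwriting the oldest entries once the buffer is full."""
        for v in values:
            self.data[self.count % self.capacity] = v
            self.count += 1

    def lookback(self, lag):
        """Return the value written `lag` steps before the latest one (lag=0 is latest)."""
        if lag < 0 or lag >= min(self.count, self.capacity):
            raise IndexError("lag outside the stored history")
        return self.data[(self.count - 1 - lag) % self.capacity]


buf = RingBuffer(capacity=4)
buf.push([10.0, 11.0, 12.0])  # first batch of targets
buf.push([13.0, 14.0, 15.0])  # second batch wraps around and overwrites 10.0, 11.0

latest = buf.lookback(0)
oldest = buf.lookback(3)
```

The capacity has to cover the largest lag you will ever request. For hourly bars and a monthly lookback that is at least 31 * 24 = 744 rows, and correspondingly more for longer timeframes.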