I am trying to build a machine learning model (an LSTM) with TensorFlow that predicts a single number from a time series of numbers.

First of all, you can imagine my dataset to look something like this:

| Index | time data | x data | y data |
|---|---|---|---|
| 0 | `np.ndarray(shape (1209278,))` | `np.ndarray(shape (1209278,))` | `numpy.float32` |
| 1 | `np.ndarray(shape (1211140,))` | `np.ndarray(shape (1211140,))` | `numpy.float32` |
| 2 | `np.ndarray(shape (1418411,))` | `np.ndarray(shape (1418411,))` | `numpy.float32` |
| … | … | … | … |

Basically, I have time data, and at each time step I have a corresponding x data point. For each time sequence I want to predict the corresponding single number found in the y data.

Simply put, I just want my model to predict a number from a time sequence of numbers.

For example like this:

- array([(time_step_1_1, x_val_1_1), (time_step_1_2, x_val_1_2), …]) => y_val_1
- array([(time_step_2_1, x_val_2_1), (time_step_2_2, x_val_2_2), …]) => y_val_2
- …

In this example x_val_1_1 means the first x value of the *first sequence* of data in my dataset, x_val_1_2 means the second x value of the *first sequence* and so on.

On the other hand, x_val_2_1 means the first x value of the *second sequence* of data, and so on; I think you get it.

It is important to notice, that my x data arrays are NOT of the same length (as you can see in the table above).
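To make the data layout concrete, here is a minimal sketch in numpy (with made-up, much shorter sequence lengths; the real ones are in the millions):

```python
import numpy as np

# Hypothetical, tiny sequence lengths just to illustrate the layout:
# per row, the time array and the x array have the SAME length,
# but lengths differ BETWEEN rows, and the target y is a single scalar.
rng = np.random.default_rng(0)
lengths = [5, 7, 4]  # stand-ins for 1209278, 1211140, 1418411

time_data = [np.arange(n, dtype=np.float32) for n in lengths]
x_data = [rng.standard_normal(n).astype(np.float32) for n in lengths]
y_data = [np.float32(rng.standard_normal()) for _ in lengths]

for i, (t, x, y) in enumerate(zip(time_data, x_data, y_data)):
    print(i, t.shape, x.shape, type(y).__name__)
```

The important point is that the rows cannot be stacked into one rectangular array, because every row has a different length.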

I also have a Google Colab notebook with a minimal example, which will probably be really helpful for understanding what I want to do. It can be found down below.

In my current attempt I have used ragged tensors from TensorFlow, which seem to be a good choice because they "make it easy to store and process data with non-uniform shapes like: Batches of variable-length sequential inputs".
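For reference, this is roughly the idea (a toy sketch with tiny made-up sequences, not my real data or model):

```python
import tensorflow as tf

# Toy variable-length sequences standing in for the x data
# (my real sequences each have more than a million steps).
x_ragged = tf.ragged.constant(
    [[0.1, 0.2, 0.3],
     [0.4, 0.5],
     [0.6, 0.7, 0.8, 0.9]],
    dtype=tf.float32,
)

print(x_ragged.shape)                  # (3, None)
print(x_ragged.row_lengths().numpy())  # [3 2 4]

# Converting a ragged tensor to a dense one pads every sequence
# to the length of the longest one, which can use a lot of memory
# when the sequences are millions of steps long.
dense = x_ragged.to_tensor()
print(dense.shape)                     # (3, 4)
```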

So far I have not used the time data, because I assumed it would be possible to get by with only the x data, without taking the time data into account.

But it turns out it is not. I was able to train my model on very powerful hardware, but the results looked like this:

In this graph the blue line is the data I want to predict, i.e. my y data. The orange line is what my model actually predicts. So it seems like my model tries to find the best *constant* value to fit the curve rather than fitting the actual curve.

On top of that, I got a lot of out-of-memory errors like this one, even though I was using very powerful hardware:

`OOM when allocating tensor with shape[22119477696] and type uint8 on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc`

Also have a look at the `shape[22119477696]` part of the error message, which clearly describes an insanely big tensor.
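To put that number in perspective, a `uint8` tensor with that many elements needs roughly 22 GB of GPU memory on its own (one byte per element):

```python
# Size of the tensor from the OOM message above:
# 22119477696 elements of dtype uint8 (1 byte each).
num_elements = 22_119_477_696
bytes_per_element = 1  # uint8
size_gb = num_elements * bytes_per_element / 1e9
print(f"{size_gb:.1f} GB")  # → 22.1 GB
```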

Clearly something is wrong with my model, but I don't know what or why.

All in all, I hope that somebody with a bit more experience than me knows how to tackle this problem. Thank you for your time; your help is greatly appreciated.

Thanks in advance!