Online Inference for LSTM

I am performing gesture recognition using an LSTM network. The input shape of the network is (1, 192, 6), i.e. 1 gesture sample with 192 timesteps and 6 features. Is there any way I can use this network to accept a gesture sample as 4 chunks of 48 timesteps each (48 * 4 = 192) and then predict the gesture?
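To make the question concrete, here is a minimal PyTorch sketch of the kind of thing I am after (the model, names, and sizes are made-up stand-ins, not my actual network): processing the 192 timesteps as 4 chunks of 48 while carrying the LSTM hidden state between calls, and getting the same prediction as a single full-length pass.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the gesture model: one LSTM layer plus a
# classifier head. All names and sizes are made up for illustration.
NUM_FEATURES, HIDDEN, NUM_CLASSES = 6, 64, 10

class GestureLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(NUM_FEATURES, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, NUM_CLASSES)

    def forward(self, x, state=None):
        # x: (1, T, 6); state carries (h, c) over from earlier chunks
        out, state = self.lstm(x, state)
        return self.head(out[:, -1]), state

model = GestureLSTM().eval()
full = torch.randn(1, 192, NUM_FEATURES)   # one complete gesture sample

# Feed 4 chunks of 48 timesteps, carrying the hidden state between calls.
state, logits = None, None
with torch.no_grad():
    for chunk in full.split(48, dim=1):    # four (1, 48, 6) chunks
        logits, state = model(chunk, state)
    logits_full, _ = model(full)           # single pass over all 192 timesteps

print(torch.allclose(logits, logits_full, atol=1e-5))  # True for this setup
```

If something equivalent is possible with my trained network (e.g. making the LSTM stateful across calls), that would solve my problem.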

I want to do this because I don't want the model to wait until all the timesteps have been received; instead it should process the timesteps as they come in, in real time.

The only thing I can think of is to run a model on (1, 48, 6) inputs 4 times and then feed the 4 results to another model with input shape (1, 4, 6) to get the final prediction; a rough sketch is below. Breaking up a DNN into smaller DNNs like this would be very useful, but I suspect that kind of decomposition is quite tricky.
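Roughly, the two-model idea looks like this (again just an illustrative PyTorch stand-in with made-up sizes): a first model summarizes each (1, 48, 6) chunk into a 6-dimensional vector, and a second model consumes the resulting (1, 4, 6) sequence to produce the final prediction.

```python
import torch
import torch.nn as nn

# Illustrative sizes only: each chunk is summarized into a 6-dim vector so
# that the second model sees an input of shape (1, 4, 6), as described above.
NUM_FEATURES, CHUNK_EMBED, HIDDEN, NUM_CLASSES = 6, 6, 64, 10

chunk_encoder = nn.LSTM(NUM_FEATURES, CHUNK_EMBED, batch_first=True)  # runs on each (1, 48, 6) chunk
combiner = nn.LSTM(CHUNK_EMBED, HIDDEN, batch_first=True)             # runs on the (1, 4, 6) summaries
head = nn.Linear(HIDDEN, NUM_CLASSES)

full = torch.randn(1, 192, NUM_FEATURES)
summaries = []
with torch.no_grad():
    for chunk in full.split(48, dim=1):        # chunks would arrive one at a time in practice
        out, _ = chunk_encoder(chunk)
        summaries.append(out[:, -1])           # (1, 6) summary of this chunk

    seq = torch.stack(summaries, dim=1)        # (1, 4, 6)
    out, _ = combiner(seq)
    logits = head(out[:, -1])                  # final gesture prediction
```

I am not sure whether training two stages like this would actually be easier than just making the original network process the chunks statefully, so any advice on either approach is welcome.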