Model overfitting with LSTM layers

I am trying to train a model on human skeleton data. I was able to achieve good accuracy on the training set, but the validation loss reaches a point and then starts to increase again, although the validation accuracy does not decrease over time. Clearly it is overfitting, and I understand that. To reduce it I have tried most of the usual techniques, but was not able to decrease the validation loss: dropout, reducing the model capacity, and adding regularization losses to the layers, all with no luck.
The loss/accuracy log graph is shown below.

Any ideas to improve the model??
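
For context, the kind of regularization I tried looks roughly like this (a minimal PyTorch sketch with placeholder layer sizes, not my exact model):

```python
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    """Toy LSTM classifier over skeleton sequences (placeholder sizes)."""
    def __init__(self, in_dim=75, hidden=128, n_classes=20, p_drop=0.5):
        super().__init__()
        # dropout between the stacked LSTM layers
        self.lstm = nn.LSTM(in_dim, hidden, num_layers=2,
                            batch_first=True, dropout=p_drop)
        # dropout before the classifier head
        self.drop = nn.Dropout(p_drop)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                      # x: (batch, time, 75)
        out, _ = self.lstm(x)
        return self.fc(self.drop(out[:, -1]))  # classify from the last time step

model = SkeletonLSTM()
# weight decay plays the role of the per-layer regularization loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```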

Have you tried to augment the dataset?

Yep, I have tried that too. But I feel that for this number of model parameters it simply needs more data. Maybe I'm wrong.

Have you also tried to overfit on train+validation?

No. When I did data augmentation, the training loss seemed fine, but the validation loss still increased again.

Are you handling a 2D or 3D pose dataset?

It's 3D pose, but it has been reshaped to 75 features (i.e. 25 keypoints × 3 coordinates).
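
To make that concrete, the reshape just flattens the keypoint axis into the feature axis, along these lines (a small NumPy sketch with a made-up frame count):

```python
import numpy as np

# one sample: (frames, 25 keypoints, 3 coordinates)
sample = np.random.randn(300, 25, 3)

# flatten each frame to a 75-dim vector, so the LSTM sees (frames, 75)
sample_flat = sample.reshape(sample.shape[0], 25 * 3)
print(sample_flat.shape)  # (300, 75)
```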

I don't know your dataset, but if you cannot collect more training data to cover your validation distribution, you can try an interesting augmentation approach like this one:

https://arxiv.org/abs/2105.02465
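
If it helps, even a much simpler geometric augmentation (this is not the method from the paper above, just a generic example) can be applied directly to the 3D joints:

```python
import numpy as np

def augment_skeleton(joints, max_angle=0.3, noise_std=0.01):
    """joints: (frames, 25, 3). Random rotation about the vertical axis plus Gaussian jitter."""
    theta = np.random.uniform(-max_angle, max_angle)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[  c, 0.0,   s],
                    [0.0, 1.0, 0.0],
                    [ -s, 0.0,   c]])
    rotated = joints @ rot.T
    return rotated + np.random.normal(0.0, noise_std, joints.shape)
```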

I am using the NTU-RGBD dataset for training. Based on your idea, what should the validation distribution look like? My dataset size is around 18,000 samples, split 80:10:10, and the model has around 210,864 parameters.
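
For reference, a class-stratified 80:10:10 split can be done along these lines (a scikit-learn sketch with placeholder arrays standing in for the real features and labels, not my exact preprocessing):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# placeholders: 18,000 samples with 75 features each and 20 class labels
X = np.random.randn(18000, 75)
y = np.random.randint(0, 20, size=18000)

# 80% train, then split the remaining 20% evenly into 10% val / 10% test
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)
```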

Are you training on NTU-RGBD and evaluating on your own custom dataset?

I am using NTU-RGBD for both training and validation.

In your graph, was the loss/accuracy for the action recognition task or for the keypoints?

It was for action recognition, because I am predicting 20 classes.

Have you correctly sampled/balanced all the classes in the training set?

Yes, I took that into account while preparing the dataset.

Have you tried to build the confusion matrix, or the classification error for each class, to check how it is distributed?

This is the data distribution; the numbers correspond to classes.
Counter({14: 850, 16: 849, 9: 848, 18: 848, 5: 848, 4: 847, 6: 847, 17: 846, 1: 845, 19: 845, 3: 845, 15: 845, 12: 844, 8: 844, 10: 844, 2: 843, 0: 841, 11: 840, 13: 837, 7: 834})

Yes, but I meant how the validation error is distributed over the classes.

Before debugging your custom model, have you tried to reproduce approximate results with any well-known model on this dataset?

How can I get the validation error distributed over classes? I mean, how do I visualise the loss per class?

You can build the confusion matrix from the validation ground-truth labels and the model predictions.
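
For example, something along these lines works (a scikit-learn sketch; `val_labels` and `val_preds` are placeholders for your real validation ground truth and model predictions):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# placeholders for the real validation ground-truth and predicted class indices
val_labels = np.random.randint(0, 20, size=1800)
val_preds  = np.random.randint(0, 20, size=1800)

cm = confusion_matrix(val_labels, val_preds, labels=np.arange(20))
per_class_acc = cm.diagonal() / cm.sum(axis=1)   # accuracy for each class
print(per_class_acc)
print(classification_report(val_labels, val_preds))
```

Each row of the confusion matrix shows how that class's validation samples are spread over the predicted classes, which is exactly the per-class error distribution.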