What is the relationship between the number of LSTM memory units and training performance?

Is it true that the more memory units there are, the better the training performance?

Not necessarily. More units give the network more capacity, which may boost performance, but adding too many can cause the network to overfit by memorizing your training dataset. You could try building multiple models with varying numbers of memory units and pick the one that does best on data held out from training.
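
As a rough sketch of that idea, something like the following could work. The names `lstm_param_count`, `pick_best_units`, and `fit_and_score` are made up for illustration and are not from any library. `fit_and_score` is assumed to be a function you write yourself that builds an LSTM with the given number of units, trains it, and returns a validation loss (lower is better). The parameter count assumes a standard single LSTM layer with bias terms.

```python
from typing import Callable, Dict, Iterable, Tuple


def lstm_param_count(units: int, input_dim: int) -> int:
    """Trainable parameters in one standard LSTM layer (with biases).

    Each of the 4 gates has an input weight matrix (input_dim x units),
    a recurrent weight matrix (units x units), and a bias vector (units).
    """
    return 4 * (units * (units + input_dim) + units)


def pick_best_units(
    candidates: Iterable[int],
    fit_and_score: Callable[[int], float],
) -> Tuple[int, Dict[int, float]]:
    """Train one model per candidate size and keep the lowest validation loss.

    fit_and_score is a hypothetical user-supplied function: it should train
    a model with `units` memory units and return its validation loss.
    """
    scores = {units: fit_and_score(units) for units in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```

You would call it as `best, scores = pick_best_units([16, 32, 64, 128], fit_and_score)`. The parameter count grows roughly with the square of the number of units, which is one reason a very wide layer can memorize a small training set.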

Thank you very much! :grinning: