What is happening to my training?

I've been seeing something strange. All of my training runs hit a vanishing gradient at (almost) the same epoch, whether I use relu, elu, selu, or tanh activation functions. The loss jumps to a higher value and the gradients stop changing. This happens even when I add a batch normalization layer, although the loss increase is then less pronounced. The same thing occurs for every one of my 10 cross-validation folds. I have double-checked: the absolute loss values differ between activation functions and/or cross-validation folds, but the behavior does not.

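```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
batch_normalization (BatchNo (None, 17)                68
_________________________________________________________________
hidden1 (Dense)              (None, 36)                648
_________________________________________________________________
hidden2 (Dense)              (None, 36)                1332
_________________________________________________________________
dense (Dense)                (None, 1)                 37
=================================================================
Total params: 2,085
Trainable params: 2,051
Non-trainable params: 34
```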
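
For reference, the parameter counts in the summary can be reproduced by hand. The sketch below assumes standard Dense layers with a bias term and a batch normalization layer with its default learnable scale and offset. The helper names are made up for illustration and are not part of Keras.

```python
def dense_params(n_in, n_out):
    # One weight per input/output pair, plus one bias per output unit.
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    # Trainable: scale (gamma) and offset (beta), one each per feature.
    # Non-trainable: moving mean and moving variance, one each per feature.
    trainable = 2 * n_features
    non_trainable = 2 * n_features
    return trainable, non_trainable

bn_trainable, bn_non_trainable = batchnorm_params(17)
hidden1 = dense_params(17, 36)
hidden2 = dense_params(36, 36)
output = dense_params(36, 1)

trainable_total = bn_trainable + hidden1 + hidden2 + output
total = trainable_total + bn_non_trainable

print(hidden1, hidden2, output)                  # per-layer Dense counts
print(total, trainable_total, bn_non_trainable)  # totals as in the summary
```

The 34 non-trainable parameters are the moving mean and moving variance of the batch normalization layer over the 17 input features, so the summary itself is consistent with the architecture described.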

Have you double-checked that you are shuffling your dataset correctly? Have you checked whether you are getting NaN loss values?
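
One way to check the second point is to scan the per-epoch losses after a run. This is only a rough sketch: it assumes you have the losses as a plain Python list (for example the `"loss"` entry of the `History.history` dict that Keras `fit` returns). The function names, the `factor` threshold, and the example numbers are made up for illustration.

```python
import math

def first_non_finite(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for epoch, loss in enumerate(losses):
        if not math.isfinite(loss):
            return epoch
    return None

def first_jump(losses, factor=2.0):
    """Return the first epoch whose loss exceeds `factor` times the previous one."""
    for epoch in range(1, len(losses)):
        prev, cur = losses[epoch - 1], losses[epoch]
        if math.isfinite(prev) and math.isfinite(cur) and prev > 0 and cur > factor * prev:
            return epoch
    return None

# Made-up losses that mimic the described pattern: a jump, then a plateau.
example_losses = [0.90, 0.62, 0.48, 0.41, 1.35, 1.35, 1.35]

print("first non-finite epoch:", first_non_finite(example_losses))
print("first jump epoch:", first_jump(example_losses))
```

If the jump lands on the same epoch in every fold, it would also be worth comparing which samples each run sees around that point. Keep in mind that `fit` shuffles the training data by default, but `validation_split` takes the last samples before any shuffling happens.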