Weight normalisation in Custom Layer

I am building a custom GRU model and have finished training it on my CHBMIT EEG data. The weights obtained after training have different maximum values (for different patients). However, I need the weights to be in the range of -1 to 1 (a fixed constraint of my hardware).
After completing training, I normalised the final weights and used them on my test data. I am getting good results in terms of accuracy, sensitivity, specificity, and F1-score.
However, my question is: since I normalised the weights after training, the model has changed, so how am I still getting good results on both the test and train data?

Hi @Lakshmi_Iyer ,

Here are my thoughts on weight normalization:

When you normalize the weights of a neural network, you are essentially changing the scale of the weights, but not the underlying relationships between the weights. This means that the model is still able to learn the same underlying patterns in the data, even though the weights are different.
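For example, one simple way to map a trained weight matrix into [-1, 1] is to divide it by its largest absolute entry, which preserves the ratios between weights. A minimal sketch with made-up values (not the actual trained weights):

```python
def normalise_weights(weights):
    """Scale a weight matrix into [-1, 1] by its largest absolute entry."""
    max_abs = max(abs(w) for row in weights for w in row)
    return [[w / max_abs for w in row] for row in weights]

# Hypothetical trained weights; the largest magnitude here is 4.0.
weights = [[2.5, -1.0], [0.5, 4.0]]
normed = normalise_weights(weights)
# Every entry of `normed` now lies in [-1, 1], and the ratio between
# any two weights is the same as before the rescaling.
```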

In any case, if you are getting good results on both the train and test data, then there is no need to worry. The model is still able to learn the patterns in the data and make accurate predictions.

Here are some additional things to keep in mind:

  • The amount of normalization that you need to apply will depend on the specific data set and the model that you are using.
  • You should always normalize the weights before training the model, if possible. This will help to improve the stability of the training process and prevent overfitting.
  • You can also normalize the data before training the model. This can also help to improve the performance of the model.
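As a concrete example of the last point, a common way to normalise input data before training is z-score standardisation (zero mean, unit variance). A minimal pure-Python sketch with illustrative values:

```python
def standardise(xs):
    """Shift a sequence to zero mean and unit variance (z-score)."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / var ** 0.5 for x in xs]

data = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy stand-in for EEG feature values
scaled = standardise(data)
# `scaled` now has mean 0 and variance 1.
```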

Hope this helps!


Thank you for your response.

My data is already normalised.

Now what I have done is, within the epoch loop, each time the Adam optimisation step runs and the weights are updated, I read the weights from my model, normalise them, and assign them back to the model.
So, in this way I'm normalising the weights inside the training loop itself.
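The pattern described above can be sketched as follows. The model, optimiser, and gradients here are toy stand-ins (a plain list of weights and a fixed update direction), not the actual GRU or Adam, purely to show the read-normalise-write-back step inside the loop:

```python
def normalise(weights):
    """Rescale a weight vector into [-1, 1] by its largest absolute entry."""
    max_abs = max(abs(w) for w in weights) or 1.0  # guard against all-zero weights
    return [w / max_abs for w in weights]

weights = [0.3, -0.8]              # toy initial weights
for step in range(3):              # stands in for the batch loop
    grads = [0.5, -1.2]            # stands in for the optimiser's update direction
    weights = [w - 0.1 * g for w, g in zip(weights, grads)]  # optimiser step
    weights = normalise(weights)   # re-normalise immediately after the update
# After every step, the weights stay within [-1, 1].
```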

Once the entire training is done, I check all performance parameters on the test data as well as the train data.

Hope this is a better approach.
Please let me know

Hi @Lakshmi_Iyer,

This approach can work, and it’s a way to ensure that your model’s weights are in the desired range throughout the training process. Just be aware that it might add some computational overhead due to the additional weight normalization step during each epoch.

However, it is important to note that you should not normalise the weights too aggressively, as over-constraining them can also hurt the performance of the model.


I am building a custom GRU model for CHBMIT EEG data. I have balanced the data and then split it into train (80%) and test (20%) without shuffling, as I want to maintain the time-series order. I am not using a validation set.

The data I used has 800 training samples and 200 testing samples.
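The chronological 80/20 split described above can be sketched as below; `samples` is a stand-in for the 1000 balanced EEG windows, kept in time order:

```python
samples = list(range(1000))        # 1000 balanced samples, in time order
split = int(0.8 * len(samples))    # 80% train -> 800 samples
train, test = samples[:split], samples[split:]
# No shuffling: every training sample precedes every test sample in time.
```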

In the training loop, I’m doing the following

  1. Hyperparameter optimisation (for epochs and learning rate)
  2. Weight normalisation and quantization (1 sign, 1 integer, 8 fractional bits – fixed-point representation), to be used later for my hardware
  3. Weight update: batch-wise updates (say, after every 20 predictions). Optimizer: Adam; loss: binary cross-entropy
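The fixed-point step in (2) can be sketched as follows. With 1 sign bit, 1 integer bit, and 8 fractional bits, representable values are multiples of 2⁻⁸ in [-2, 2 - 2⁻⁸]; the exact rounding and saturation behaviour of the target hardware is an assumption here:

```python
SCALE = 2 ** 8  # 8 fractional bits -> resolution of 1/256

def quantise(w):
    """Round to the nearest representable fixed-point value and saturate."""
    q = round(w * SCALE) / SCALE                  # snap to a multiple of 1/256
    return max(-2.0, min(2.0 - 1.0 / SCALE, q))   # clamp to the representable range
```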

After training,
I use the final trained weights on my test data.
I get the predictions, compute the AUC-ROC on the test data, and obtain a threshold.
Using that threshold, I evaluate the performance (sensitivity, specificity, F1-score, accuracy, precision).

I also check the performance on my train data using the above threshold (note: it was generated on the test data).
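As an aside on the threshold step: AUC-ROC itself is threshold-free, so a threshold has to be derived from the ROC curve separately. One common choice is Youden's J statistic (maximising TPR - FPR); whether that is the method used here is an assumption, and the labels/scores below are toy values:

```python
def best_threshold(labels, scores):
    """Pick the score threshold that maximises TPR - FPR (Youden's J)."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(scores)):
        tpr = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t) / pos
        fpr = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t) / neg
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return best_t

labels = [0, 0, 1, 1]              # toy ground truth
scores = [0.1, 0.4, 0.35, 0.8]     # toy model scores
t = best_threshold(labels, scores)
```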

I get good test data performance.
But the train data performance is very bad. Why is this happening? The model has already seen the train data, so it should give the best results there.
Is the AUC-ROC approach right?

Please let me know.

Thank you.