Test_on_batch() gives the same loss output on different batches in the loop

It seems there is a bug in test_on_batch().

I noticed the problem when plotting the test results of the trained network: the plot was a straight horizontal line. I am using a Sequential model. The TensorFlow version is 2.10.0.

I use train_on_batch(), which gives converging losses. When I switch to test_on_batch(), the loss stays the same across different batches. When I restart the test with different test files, it gives a different loss value.

Here is the code for that section:

    print('mfccs3 value = ', tf.keras.backend.eval(mfccs3[1,:]))    # show part of the current batch
    #logs = vadModel.train_on_batch(mfccs3, vadLabel)               # training call: losses converge
    logs = vadModel.test_on_batch(mfccs3, vadLabel)                 # evaluation call: loss never changes
    print('string logs = ', str(logs))

The result is:
    index = 1

    mfccs3 value = [[-8.2793800e+01 -5.9538417e+00 9.8302096e-01 -3.5255635e-01
    3.0392697e-01 -6.4597696e-01 2.2358397e-02 2.5344249e-02
    -6.8171650e-01 -3.7053981e-01 -3.4044239e-01 -8.1056818e-02]]

    string logs = 0.2398043

    index = 2

    mfccs3 value = [[-69.159195 -2.2269542 4.2501264 -1.3486748 0.62957734
    -3.2606528 -3.253118 -3.5308673 -1.1313365 -1.1839466
    -2.330786 -1.6313086 ]]

    string logs = 0.2398043

    index = 3

    mfccs3 value = [[-64.894104 -1.892648 0.11392474 -0.81098145 -1.4640433
    -1.1901256 -1.7744782 -0.85753983 -0.9694403 -0.8149232
    -1.0680746 -1.0442001 ]]

    string logs = 0.2398043

You can see that the inputs for test_on_batch() have changed. However, the loss remains the same. I use the same code for train_on_batch(), which seems to be fine.
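
For reference, here is a minimal sketch of what I would expect from test_on_batch() (a toy Sequential model with random stand-in data, not my actual vadModel): the loss should differ from batch to batch.

    import numpy as np
    import tensorflow as tf

    # Toy stand-in for my setup (hypothetical model and data, just to show the
    # expected behaviour): test_on_batch() should report a different loss for
    # different input batches.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_shape=(12,)),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy')

    for index in range(1, 4):
        x = np.random.randn(1, 12).astype('float32')               # stand-in for an MFCC frame
        y = np.random.randint(0, 2, size=(1, 1)).astype('float32')
        loss = model.test_on_batch(x, y)                            # evaluates only, no weight update
        print('index =', index, 'loss =', loss)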

I am new to this area. Please help. Thank you in advance.

I dug a little deeper by comparing the source code of train_on_batch() and test_on_batch(). I couldn't find any issue in test_on_batch().

@G_Young Welcome to the TensorFlow Forum!

Please let us know if you still require any assistance regarding the issue mentioned above.

Hi Tanya,

Thank you for asking. The issue is not solved yet. I tried to skip test_on_batch() and worked with predict_on_batch() instead. It has the same problem: the predicted outputs are the same for different batches of input data.
I reported the issue as a bug on github.com. Today, I was asked to provide code to reproduce it. I will try to send in the code for replication later.
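
Here is a simplified version of the kind of check I plan to include in the repro (a toy model and made-up inputs, not my real vadModel): two clearly different inputs should give different predictions from predict_on_batch().

    import numpy as np
    import tensorflow as tf

    # Hypothetical stand-in model, only to illustrate the check.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_shape=(12,)),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])

    x1 = np.zeros((1, 12), dtype='float32')
    x2 = np.full((1, 12), 10.0, dtype='float32')
    print('pred x1 =', model.predict_on_batch(x1))   # expected to differ from pred x2
    print('pred x2 =', model.predict_on_batch(x2))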

I think train_on_batch() has the same problem: it only uses the first batch of data for training. The model is not actually trained, even though the reported error keeps getting smaller.
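
A rough way to check that claim (again with a toy model, not my actual network): if train_on_batch() really trains, the weights must change after a call.

    import numpy as np
    import tensorflow as tf

    # Toy model just for the weight-change check (assumed 12-dim inputs as above).
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_shape=(12,)),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy')

    x = np.random.randn(4, 12).astype('float32')
    y = np.random.randint(0, 2, size=(4, 1)).astype('float32')

    w_before = [w.copy() for w in model.get_weights()]
    model.train_on_batch(x, y)                        # one optimisation step
    w_after = model.get_weights()
    changed = any(not np.array_equal(a, b) for a, b in zip(w_before, w_after))
    print('weights changed after train_on_batch:', changed)   # should print True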

Hi Tanya,

I still need your help. The bug is reported on github.com; here is the link: bug link. It was reproduced, but it is not fixed yet.

This is what I found: all three on_batch() functions, train_on_batch(), test_on_batch(), and predict_on_batch(), have the problem. They just use the first batch of data. These functions have existed for many years, so the bug must have been introduced recently. Could you please have someone look into it? I need these functions for my project. Thank you!