Prediction gives the same result whatever value I request

Hello,

I’d like to understand where my mistake is in training and predicting from a model.
My goal is to predict whether a value is odd or even (or out of vocabulary), so it is a kind of classification.
But after training is fully completed, I test my model using unseen values and the prediction always gives the same unexpected result.

Note this is only for testing and the final accuracy does not matter to me.

Here is what I did using the Node.js TensorFlow API:

  1. I have roughly 10,000 values in the dataset; half of the data is labeled as “odd” and the other half as “even”. Do you think that is enough? How many inputs should I train on for such a use case?
  2. Values are positive float values
  3. min value 0.0016584342301295685
  4. max value 9.998886873517526
  5. On my data I take parseInt(floatValue), so that only the integer part determines the Odd/Even label, e.g. 1.2 is Odd, as is 1.9 (see the labeling sketch after this list).
  6. I split the dataset into 90% training and 10% validation, and I pick the training data randomly (after shuffling).
  7. I set 50 epochs and a batch size of 20… I just picked those values for testing.
  8. Training gave me a loss of 0.7347 and an accuracy of 0.4969… so I expect about 50% errors on predictions. Very bad, but again, I don’t care because I’m just testing my development.
  9. But when I ask my code to predict from unseen values, I get the following results:
  • 1.2 ==> Odd with a score of 0.481
  • 1.5 ==> Odd with a score of 0.481
  • 1.9 ==> Odd with a score of 0.481
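
As mentioned in step 5, the labeling rule is roughly this (a sketch; the helper name is hypothetical):

	// Hypothetical helper illustrating the labeling rule from step 5:
	// only the integer part decides the class, so 1.2 and 1.9 are both "Odd".
	function labelOf(floatValue) {
		const n = Math.trunc(floatValue); // same as parseInt for positive values
		return n % 2 === 0 ? "Even" : "Odd";
	}

	labelOf(1.2); // "Odd"
	labelOf(2.7); // "Even"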

FYI: The model is the following:

		const tf = require("@tensorflow/tfjs-node"); // import needed to run this snippet

		const model = tf.sequential();
		model.add(tf.layers.dense({
			inputShape: [ 1 ],
			units: 1,
			activation: "relu"
		}));
		model.add(tf.layers.dense({
			units: 3, // three classes: “odd”, “even”, and “oov” (out of vocabulary), even though “oov” does not make much sense in this context
			activation: "softmax"
		}));
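
The prediction call itself looks roughly like this (a simplified sketch; the input tensor shape follows from inputShape: [ 1 ] above):

	// Simplified sketch of the prediction call; shape [1, 1] matches inputShape: [ 1 ].
	const scores = model.predict(tf.tensor2d([ [ 1.2 ] ]));
	scores.print(); // prints the same distribution whatever the input, e.g. 0.481 for "Odd"
	const classIndex = scores.argMax(-1).dataSync()[0]; // index into my vocabulary array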

What do you think? Where could my mistakes be?

Thanks.

@Math,

It seems the issue is with your model and training approach.

Could you please try as suggested below and let us know?

  1. Increase the size of your dataset, if possible, for better model performance.
  2. The model is very simple. Try adding more layers to the model and see if that improves performance (see the sketch after this list).
  3. The model may not be trained enough; 50 epochs may not be sufficient to accurately predict unseen values. Try increasing the number of epochs and see if that improves performance.
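
For point 2, something along these lines could be a starting point (a sketch; the layer sizes are arbitrary):

	// Sketch of a slightly deeper model; the layer sizes are arbitrary starting points.
	const model = tf.sequential();
	model.add(tf.layers.dense({ inputShape: [ 1 ], units: 16, activation: "relu" }));
	model.add(tf.layers.dense({ units: 16, activation: "relu" }));
	model.add(tf.layers.dense({ units: 3, activation: "softmax" }));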

Thank you!

Hi @chunduriv, and thank you for your suggestions.

I took a long time to answer your post, as I tested several use cases without taking notes, and then redid the tests so I could get back to this post.

Unfortunately, there is something I don’t understand: my test results do not help me much and I still get unexpected predictions; at the least, the accuracy remains low.
In the following picture, I list the tests I conducted and the accuracy + loss results I got from training + validation:

The best accuracy I could get is 0.7312.

I also tested adding one or two dense layers with sigmoid activation and arbitrary unit counts, but it does not seem to have a big impact on the results.

Do you have any idea what could be inaccurate? (I still suspect my code is not well implemented, but it would be complicated to share the source code with you, as it is integrated into a complex platform.)

Thanks

A couple of considerations.

First, how are you encoding the target values? As strings (“odd”, “even”, and “oov”), as indicated in a comment in your code? Or as a simple integer encoding (e.g. 0 for even and 1 for odd)?

Second, it appears you’re using binaryCrossentropy for the loss function, but depending on your encoding, the number of possible classes may exceed two, in which case you should probably use categorical cross-entropy.
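
If your targets are one-hot over three classes, the compile step would look something like this (a sketch; the optimizer shown is illustrative):

	// Sketch: one-hot targets over three classes call for categorical cross-entropy.
	model.compile({
		optimizer: "adam", // illustrative; keep whatever optimizer you currently use
		loss: "categoricalCrossentropy",
		metrics: [ "accuracy" ]
	});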

Thanks for the feedback.

how are you encoding the target values? As strings (“odd”, “even”, and “oov”)
I’m using tf.oneHot(), based on the index of the string in my vocabulary array.
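
Roughly like this (a sketch; the vocabulary order shown here is illustrative):

	// Sketch of my label encoding; the vocabulary order is illustrative.
	const vocabulary = [ "oov", "Odd", "Even" ];
	const labels = [ "Odd", "Even", "Odd" ]; // example labels
	const indices = labels.map(l => vocabulary.indexOf(l)); // e.g. "Odd" -> 1
	const oneHotLabels = tf.oneHot(tf.tensor1d(indices, "int32"), vocabulary.length);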

Second, it appears you’re using binaryCrossentropy
Good point; in my training dataset I have the following balance, so I have not trained on any “oov” at all. Should I remove it from the output classes?

"training_balance_number_of_data": {
                "oov": 0,
                "Odd": 106816,
                "Even": 106455
            }
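
If I drop “oov”, the output layer would shrink to two units, something like (a sketch):

	// Sketch: without "oov" the output layer would have only two classes.
	model.add(tf.layers.dense({ units: 2, activation: "softmax" }));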

the number of possible classes may exceed two
Actually, I have also run some tests using:

  • binaryCrossentropy: best accuracy 0.7312; this is the run in the screenshot above
  • categoricalCrossentropy: best accuracy 0.5825; so it performs worse

I’m currently running additional tests with an added layer.

Hello,

I have performed additional tests and am still getting bad results, without understanding what should be optimized.

In my use case I wanted to predict whether a number is odd or even, so instead of training on float values I am now training on integers, which should make more sense.

Additionally, I believe (correct me if I’m wrong) that training with normalized values does not make sense in my case, since normalization outputs a float in [0…1] during training while I want to train and predict on odd/even values. So I have removed the normalization from my code.
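
For reference, this is the kind of min-max scaling I removed (a sketch, using the min/max from my dataset above):

	// Sketch of the min-max normalization I removed (min/max taken from my dataset).
	const min = 0.0016584342301295685;
	const max = 9.998886873517526;
	const normalize = x => (x - min) / (max - min); // maps inputs into [0, 1]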

Here are the results I got for each training run. Any feedback or advice is welcome.