Hi, I have an image dataset for thermal facial-expression classification with five classes. I have already applied some image augmentation, but the training accuracy keeps increasing while the validation accuracy does not. Sometimes both stay constant at 0.20%.
Hi @Imran_Ullah ,
This typically indicates overfitting. Overfitting occurs when a model learns to perform well on the training data but fails to generalize to unseen data.
The model needs hyperparameter tuning to become stable on unseen data.
You can try the steps below and check whether they help:
- Increase your dataset
- Explore different augmentation techniques
- Balance your dataset
- Use regularization techniques
- Hyperparameter tuning
Meanwhile, you can explore the official TensorFlow models repository on GitHub for various pre-trained models and checkpoints, and the TensorFlow image classification tutorial for more insights.
If possible, could you also give more details on your problem, such as what kind of model you are using and the size of the data?
I hope this helps.
The total number of thermal (infrared) images is 500. I increased the images of each class up to 2000, but the model still does not generalize to unseen data.
I use ResNet50, VGG16, and other architectures,
with a learning rate of 0.0001 and batch sizes of 16 and 32.
Should I rearrange my dataset in this format?
training_data = 400 images
Test_data = 50 images
Val_data = 50 images
My previous data looked like this:
training_data = 401 images
Test_data = 39 images
Val_data = 59 images
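The 400/50/50 split proposed above can be sketched in plain Python. This is only a minimal illustration under some assumptions: the file names and three of the class names are made up, each class is assumed to have 100 original images, and the split is done per class on the original images (before any augmentation) so every class appears in each split in the same proportion. `stratified_split` is a hypothetical helper, not a TensorFlow function.

```python
import random
from collections import defaultdict

def stratified_split(samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Split (path, label) pairs into train/val/test, class by class."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_class):
        paths = sorted(by_class[label])
        rng.shuffle(paths)
        n_val = round(len(paths) * val_frac)
        n_test = round(len(paths) * test_frac)
        splits["val"] += [(p, label) for p in paths[:n_val]]
        splits["test"] += [(p, label) for p in paths[n_val:n_val + n_test]]
        splits["train"] += [(p, label) for p in paths[n_val + n_test:]]
    return splits

# Hypothetical data: 5 classes x 100 original images = 500 images.
classes = ["angry", "happy", "neutral", "sad", "surprised"]
samples = [(f"{c}_{i:03d}.png", c) for c in classes for i in range(100)]

splits = stratified_split(samples)
print({name: len(items) for name, items in splits.items()})
```

With these assumed inputs the three splits contain 400, 50, and 50 images, and no image appears in more than one split.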
Hi @Imran_Ullah ,
This clearly suggests that the cause is a lack of training data. You need to give the model enough data to find the patterns in it, so that it works well on unseen data.
Thanks, but I have already increased each class up to 2000 images. Or should I try different augmentation methods?
Hi @Laxma_Reddy_Patlolla, I increased my dataset to 49000 images for training and 2050 for validation. Now ResNet50
shows a training accuracy of 88% after 1 epoch, but the validation accuracy is still 23%.
Should I increase the validation data or make any other changes?
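To put the 23% validation accuracy in context, it can help to compare it with two trivial baselines: random guessing (1 divided by the number of classes, so 0.20 for five classes) and always predicting the most frequent class. The sketch below is only an illustration; it assumes the 2050 validation images are spread evenly over the five classes, which the thread does not state, and `baseline_accuracies` is a hypothetical helper.

```python
from collections import Counter

def baseline_accuracies(val_labels):
    """Return (chance, majority) accuracy for a list of validation labels."""
    counts = Counter(val_labels)
    chance = 1 / len(counts)                           # uniform random guessing
    majority = max(counts.values()) / len(val_labels)  # always predict the top class
    return chance, majority

# Assumption: 2050 validation images, 410 per class, labels 0..4.
val_labels = [c for c in range(5) for _ in range(410)]

chance, majority = baseline_accuracies(val_labels)
print(f"chance={chance:.2f}, majority={majority:.2f}")
```

If the validation accuracy stays close to these baselines while the training accuracy is high, the model is doing little better than guessing on the validation images.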
What are your suggested best practices for balancing a dataset?
Let's say I have a classification dataset with 5 categories, so each category should have the same number of images: for example, 2000 images belong to the sad category, another 2000 images belong to the happy category, and so on. If I am wrong, please correct me.
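When the per-class counts cannot be made equal, one common alternative is to weight each class inversely to its frequency during training. The sketch below uses the heuristic n_samples / (n_classes * class_count); the class names other than sad and happy and all the counts are made up for illustration, and `balanced_class_weights` is a hypothetical helper.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Assumption: four classes with 2000 images each and one with only 1000.
labels = (["sad"] * 2000 + ["happy"] * 2000 + ["angry"] * 2000
          + ["neutral"] * 2000 + ["surprised"] * 1000)

weights = balanced_class_weights(labels)
for name, w in sorted(weights.items()):
    print(f"{name}: {w:.2f}")
```

A perfectly balanced dataset gives every class a weight of 1.0, so this reduces to the equal-count case described above. In Keras, a dict like this, keyed by integer class index instead of name, can be passed as the `class_weight` argument of `model.fit`.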