CNN model predicting values not in the dataset

I’m working on a bounding box regression problem using a CNN. However, the model predicts values that are way off, and that don’t even fall in the range of the dataset’s targets.

The head of the dataframe:

                                image      left     width       top    height
0  57503_000116_Sideline_frame490.jpg  0.772549  0.019608  0.674510  0.025490
1  57503_000116_Sideline_frame490.jpg  0.917647  0.021569  0.566667  0.023529
2  57503_000116_Sideline_frame490.jpg  0.739216  0.017647  0.527451  0.023529
3  57503_000116_Sideline_frame490.jpg  0.774510  0.017647  0.523529  0.019608
4  57503_000116_Sideline_frame490.jpg  0.786275  0.023529  0.596078  0.019608

Then, to generate the datasets, I use the flow_from_dataframe method:

data_gen = keras.preprocessing.image.ImageDataGenerator(
    dtype="float32",
    rescale=1/255,
    horizontal_flip=True,
    validation_split=0.2,
)

train_gen = data_gen.flow_from_dataframe(
    dataframe=chunk,
    directory=f"{IMAGES_DIR}",
    x_col="image",
    y_col=["left", "width", "top", "height"],
    subset="training",
    has_ext=True,
    batch_size=BATCH_SIZE,
    target_size=(int(NEW_WIDTH), int(NEW_HEIGHT)),
    class_mode="other",
    # save_to_dir=f"{IMAGES_DIR}/augmented",
    shuffle=True,
    seed=42,
)

val_gen = data_gen.flow_from_dataframe(
    dataframe=chunk,
    directory=f"{IMAGES_DIR}",
    x_col="image",
    y_col=["left", "width", "top", "height"],
    subset="validation",
    has_ext=True,
    batch_size=BATCH_SIZE,
    target_size=(int(NEW_WIDTH), int(NEW_HEIGHT)),
    class_mode="other",
    shuffle=True,
    seed=42,
)

This is my model:

model = keras.models.Sequential([
    keras.layers.Input(shape=(NEW_HEIGHT, NEW_WIDTH, 3)),
    keras.layers.Conv2D(filters=64, kernel_size=(3, 3), padding="same", kernel_regularizer="l2", activation="relu"),
    keras.layers.MaxPool2D(),
    keras.layers.Conv2D(filters=32, kernel_size=(3, 3), padding="same", kernel_regularizer="l2", activation="relu"),
    keras.layers.MaxPool2D(),
    keras.layers.Conv2D(filters=16, kernel_size=(3, 3), padding="same", kernel_regularizer="l2", activation="relu"),
    keras.layers.MaxPool2D(),
    keras.layers.Dropout(0.2),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(4),
])

And this is how I train:

opt = keras.optimizers.Adam(learning_rate=0.0001)

self.model.compile(
    optimizer=opt,
    loss="mse",
)

Training for 5 epochs yields the following results:

Epoch 1/5
50/50 [==============================] - 17s 272ms/step - loss: 1.0056 - val_loss: 0.8319
Epoch 2/5
50/50 [==============================] - 13s 260ms/step - loss: 0.8965 - val_loss: 0.7225
Epoch 3/5
50/50 [==============================] - 13s 260ms/step - loss: 0.8164 - val_loss: 0.6626
Epoch 4/5
50/50 [==============================] - 13s 260ms/step - loss: 0.7342 - val_loss: 0.6168
Epoch 5/5
50/50 [==============================] - 13s 260ms/step - loss: 0.6926 - val_loss: 0.5710

I thought, since the val_loss is slowly going down, the predictions must be getting closer.

However, when I actually run predictions, the model outputs values that are far too big, or that make no sense at all (like negative values):

[[203.84532   26.675478 116.79072  -11.553452]]    

I don’t quite understand what I’m doing wrong here. I’m scaling the bounding boxes appropriately, and the validation loss is quite low, so I don’t see how the predictions can be this far off.

Hi @MitchellWeg,

Here are a few things you could try to improve the accuracy of your bounding box prediction model.

Normalize the target values: the bounding box coordinates should fall between 0 and 1 so the model can learn the correct scale. Your dataframe head suggests they already do, so it’s worth verifying that the values the generator actually yields are in that range, as shown below; a model trained on [0, 1] targets should rarely predict values like 203.
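A quick sanity check is to pull one batch from the generator you already have and look at the target range (a minimal sketch, using the train_gen from your snippet):

images, targets = next(train_gen)
# If normalization is correct, these should both be fractions in [0, 1].
print("target min:", targets.min(), "target max:", targets.max())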

Increase the number of epochs: Your model’s validation loss is still decreasing after 5 epochs, so you might want to try training for more epochs and see if the performance improves.
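One way to train longer without hand-picking the epoch count is an early-stopping callback (a sketch; the patience value is illustrative, not tuned):

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,                 # stop after 5 epochs with no improvement
    restore_best_weights=True,  # roll back to the weights from the best epoch
)

model.fit(train_gen, validation_data=val_gen, epochs=50, callbacks=[early_stop])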

Try different regularization: regularization helps prevent overfitting, which could be pushing the model toward extreme predictions. You could try adjusting the strength of the L2 regularization, or other forms of regularization such as dropout (which your model already uses) or data augmentation.
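Note that the string "l2" in your Conv2D layers uses Keras’s default factor of 0.01, which is fairly strong; passing a regularizer object makes the strength explicit and tunable (a sketch; 1e-4 is just a common starting point, not a tuned value):

reg = keras.regularizers.l2(1e-4)  # explicit, tunable L2 strength

keras.layers.Conv2D(
    filters=64, kernel_size=(3, 3), padding="same",
    kernel_regularizer=reg, activation="relu",
)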

Check the data: Make sure that the data you are feeding into the model is correctly labeled and formatted. It’s possible that there are errors in the data that are causing the model to make incorrect predictions.
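A direct way to check is to draw a few labeled boxes back onto their images with matplotlib (a sketch, assuming chunk is the dataframe and IMAGES_DIR the image folder from your snippet; the fractional coordinates are scaled back to pixels for drawing):

import matplotlib.pyplot as plt
import matplotlib.patches as patches

row = chunk.iloc[0]
img = plt.imread(f"{IMAGES_DIR}/{row['image']}")
h, w = img.shape[:2]

fig, ax = plt.subplots()
ax.imshow(img)
ax.add_patch(patches.Rectangle(
    (row["left"] * w, row["top"] * h),    # top-left corner in pixels
    row["width"] * w, row["height"] * h,  # box size in pixels
    linewidth=1, edgecolor="r", facecolor="none",
))
plt.show()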

Increase the batch size: larger batches give less noisy gradient estimates, which can stabilize training. Try a few values and compare the validation loss.

Use a different loss function: mean squared error (MSE) is the usual default for regression, but it may not be the best choice for your problem. You could try a loss such as Huber or L1 (MAE), which are more robust to outliers.
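Swapping the loss is a one-line change at compile time, for example with Huber:

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    loss=keras.losses.Huber(),  # quadratic near zero, linear for large errors
)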

Try using a different optimizer: Adam is a common default, but it may not be the best choice for your problem. You could try an alternative such as RMSprop or Adagrad, or simply tune Adam’s learning rate.
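Likewise a one-line change (the learning rate usually needs re-tuning when you switch; 1e-4 here just mirrors your current setting):

model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-4), loss="mse")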

Try a different architecture for your model: the model you are using is a small convolutional neural network (CNN), and other architectures, such as deeper networks or pretrained backbones, may be more suitable for your problem; see the sketch below.
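As one illustration (not a drop-in fix), a pretrained backbone with a small regression head is a common pattern. Note the sigmoid on the output layer: it constrains all four predictions to [0, 1], which matches your normalized targets and rules out values like 203 or negatives by construction. The backbone choice and head sizes here are assumptions, not tuned values:

base = keras.applications.MobileNetV2(
    input_shape=(int(NEW_HEIGHT), int(NEW_WIDTH), 3),
    include_top=False,
    weights="imagenet",
)
base.trainable = False  # freeze the pretrained weights at first

model = keras.models.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(4, activation="sigmoid"),  # left, width, top, height in [0, 1]
])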

Use more data: the more data you have, the better your model will be able to learn. You could collect more data or use data augmentation techniques to effectively enlarge your dataset.
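One caveat if you go the augmentation route with this setup: ImageDataGenerator transforms only the images, not the y_col targets, so geometric augmentations such as the horizontal_flip=True you currently enable leave the box coordinates pointing at the un-flipped positions. Photometric augmentations avoid this (a sketch; the brightness range is illustrative):

data_gen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1 / 255,
    brightness_range=(0.8, 1.2),  # changes pixel values only; boxes stay valid
    validation_split=0.2,
)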

I hope this helps you.

Thanks