Low training and validation loss, yet every prediction is off

I’m trying to do some image recognition with bounding boxes. I get a decent(?) score on both the training loss and the validation loss, yet all the predictions seem to be (way) off.

My model:

    vgg = VGG16(weights="imagenet", include_top=False, input_tensor=keras.layers.Input(shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3)) )
    vgg.trainable = False
    model = keras.models.Sequential([
        vgg,
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dropout(0.5),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dropout(0.5),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(4, activation="sigmoid"),
    ])

I’ve tried adding Dropout between the layers, yet nothing really seems to change.

I do believe that this is a sign of overfitting.

The plot:
[image: training and validation loss curves]

Any tips?

@MitchellWeg,

Welcome to the TensorFlow Forum!

Could you please elaborate on your use case?

VGG16 is a popular architecture for image classification. If your use case is classification, can you try changing the last dense layer of the model to the following, paired with a categorical cross-entropy loss?

    keras.layers.Dense(4, activation="softmax")

Thank you!

Sorry, I should have elaborated a bit further.

I’m trying to predict bounding boxes, so it’s a regression problem (I think?), which means the activation function shouldn’t be the problem.

Ah, I didn’t know VGG was used for classification.

I updated my model like so:

    model = keras.models.Sequential([
        keras.layers.Input((IMAGE_HEIGHT, IMAGE_WIDTH, 3)),
        keras.layers.Conv2D(64, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Conv2D(16, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(4, activation="sigmoid"),
    ])

Which yielded a somewhat healthier learning curve.

Yet I still have the same problem of the predictions being off.
It makes me think I’m doing something wrong when calculating the bounding boxes.
I calculate the bboxes like this:

During training, I downscale the image using Keras:

    img = tf.keras.preprocessing.image.load_img(img_path, target_size=(IMAGE_HEIGHT, IMAGE_WIDTH))
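
The resized PIL image then still has to become a normalized array. I haven’t pasted my full input pipeline, but that step amounts to this sketch, mirroring the normalization I use in make_prediction below:

    # Convert the resized PIL image to a float32 array scaled to [0, 1]
    x = tf.keras.preprocessing.image.img_to_array(img) / 255.0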

Then, since I downscaled the image, I need to scale the bboxes as well:

    # Scale each coordinate by the resized image dimensions
    (h, w) = IMAGE_HEIGHT, IMAGE_WIDTH
    top = float(row["top"]) / h
    left = float(row["left"]) / w
    height = float(row["height"]) / h
    width = float(row["width"]) / w

(The coords are in format: top, left, height, width)
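
The exact line where I assemble the target isn’t shown above, but it amounts to this sketch, using the order just described:

    # Training target: normalized box in (top, left, height, width) order
    y = np.array([top, left, height, width], dtype="float32")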

Then, when I predict:

    def make_prediction(img):
        model = tf.keras.models.load_model("my_model")
        img = preprocess_image(img)
        # Normalize pixel values to [0, 1] and add a batch dimension
        data = np.array(img, dtype="float32") / 255.0
        data = np.expand_dims(data, axis=0)

        pred = model.predict(data)[0]

        # Scale the normalized outputs back up to the original image size
        left = int(pred[0] * ORIGINAL_IMAGE_WIDTH)
        width = int(pred[1] * ORIGINAL_IMAGE_WIDTH)
        top = int(pred[2] * ORIGINAL_IMAGE_HEIGHT)
        height = int(pred[3] * ORIGINAL_IMAGE_HEIGHT)

        coords = {
            "left": left,
            "top": top,
            "width": width,
            "height": height
        }

        return coords

Which, in my mind, would explain the predictions being off by a small amount.
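
To rule the scaling in or out, I figure a quick round-trip check is worth running: chain the two steps above on a single annotation and compare with the original values (a sketch, reusing the names from my snippets):

    # Normalize one coordinate pair exactly as in training...
    top_n = float(row["top"]) / IMAGE_HEIGHT
    left_n = float(row["left"]) / IMAGE_WIDTH

    # ...then scale back exactly as in make_prediction
    top_back = int(top_n * ORIGINAL_IMAGE_HEIGHT)
    left_back = int(left_n * ORIGINAL_IMAGE_WIDTH)

    # If these don't round-trip back to the original annotation,
    # the two scalings disagree
    print(top_back, row["top"])
    print(left_back, row["left"])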

@MitchellWeg,

> Which yielded a somewhat healthier learning curve.

Your approach for calculating the bounding box coordinates seems correct. However, there might be a few other things that could be causing the issue:

  1. Loss function: Since this is a regression problem, you should be using a regression loss such as mean squared error (mse); see the sketch after this list.

  2. Scaling of target values: Make sure that you scale the targets during training in exactly the same way (and in the same order) that you rescale the predicted values at prediction time.

  3. Model capacity: You can try increasing the number of layers or filters in the model to see if that helps improve performance.

  4. Amount of data: You can try collecting more data or using data augmentation techniques to generate more training examples.
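
As a minimal illustration of point 1 (a sketch only; I’m assuming the Adam optimizer, since your compile step isn’t shown):

    model.compile(
        optimizer="adam",
        loss="mse",       # mean squared error, a standard regression loss
        metrics=["mae"],  # mean absolute error is easy to interpret for box coords
    )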

Thank you!