In object detection, we usually use a bounding box to describe the spatial location of an object.
The tutorial below fine-tunes a pre-trained RetinaNet model with a ResNet-50 backbone for object detection; it also saves and exports the trained model so it can be reused later.
You can follow a similar approach for your own use case.
You can also try any of the models in the official object detection section of the Model Garden library.
I don't know if you are up for a more extended discussion. I previously did this:
Load VGG16 without the output layers
Add a regression head
Train with 400 face images, using as label a vector with the 4 box coordinates
Predict on new face pictures and the test set
This approach just does not seem to work on any image that is not from the dataset, and even within the dataset it does not get the box tight enough around the face.
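For concreteness, the pipeline described above can be sketched roughly as follows (a hedged sketch, assuming tf.keras, 224x224 RGB inputs, and box coordinates normalized to [0, 1]; the head sizes are illustrative, not the exact ones used):

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Load VGG16 without its classification (output) layers.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the backbone frozen at first

# Add a regression head predicting 4 box coordinates.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(4, activation="sigmoid"),  # coords assumed normalized to [0, 1]
])
model.compile(optimizer="adam", loss="mse")
```

Training would then call `model.fit` on the 400 images with their 4-coordinate label vectors.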
The link posted above covers almost all the steps one should perform to make the model learn.
I wanted to add a few more tips which have greatly helped me in such a scenario.
Increase the sample size: the more data, the better the learning. Also try to diversify the samples so that the model generalizes better.
Augment the dataset to increase the number of samples, and normalize the inputs. Data augmentation and normalization are two prevalent techniques used to improve generalizability.
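One common way to do this with tf.keras preprocessing layers (an illustrative sketch, not the only option; the batch shape is made up):

```python
import tensorflow as tf
from tensorflow.keras import layers

augment_and_normalize = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255),      # normalization: scale pixels to [0, 1]
    layers.RandomFlip("horizontal"),  # augmentation: random mirroring
    layers.RandomRotation(0.05),      # augmentation: small random rotations
])

images = tf.random.uniform((8, 224, 224, 3), maxval=255.0)
batch = augment_and_normalize(images, training=True)
```

One caveat for box regression: geometric augmentations (flips, rotations) change where the face is, so the 4 label coordinates must be transformed accordingly; image-only augmentation pipelines like this one are safe only for photometric changes.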
Regularization and reducing the architecture complexity are two other methods commonly used to prevent overfitting.
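As a rough sketch of those two techniques in tf.keras (layer sizes and penalty strength are illustrative assumptions), an L2 weight penalty plus Dropout can be added to the regression head:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

head = tf.keras.Sequential([
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2 penalty on weights
    layers.Dropout(0.5),  # randomly zeroes activations during training
    layers.Dense(4),      # 4 bounding-box coordinates
])
out = head(tf.zeros((1, 512)))  # assumed 512-dim feature vector
```

Reducing architecture complexity would mean shrinking or removing Dense layers rather than adding them.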
Try setting trainable=True; in my experience, I always got better performance (lower error in regression) with trainable=True.
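With a tf.keras backbone as in the question, that amounts to unfreezing the base model (a minimal sketch; the learning rate is an assumption, but a small one is generally advised when fine-tuning pretrained weights):

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = True  # unfreeze all backbone layers for fine-tuning

# A low learning rate helps avoid destroying the pretrained weights
# in the first few updates.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)
```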
If none of the above tips works, we need to inspect the model architecture (which is fixed in our case).
Let us know if the above tips work for you.
I followed it but, to be honest, find it rather cumbersome in terms of notation; still, I will try it. At the moment I have just tried a PyTorch implementation of YOLOv4, and it seems rather easy to run and get results from.