A pretrained TensorFlow object detection model such as ssd_mobilenet_v2 is trained on the COCO dataset and makes predictions on the 90 classes defined in its label map file.
I’ve fine-tuned it with a set of new images: “helmet” and “head” (these classes do not exist in the COCO dataset). The result is a new model that works fine, but it only makes predictions on these 2 new classes.
My question: how do I get a new model that covers 90 + 2 = 92 classes?
(the 90 classes from the pre-trained model + my 2 new classes)
You must provide the exact number of classes that your model is going to detect. By default it is 90, because the pre-trained model is meant to detect the 90 object classes of the COCO dataset.
So you need to update num_classes to 92.
You should also create a category index dictionary that maps the label IDs to their corresponding label names.
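As a rough illustration of the category index idea, here is a minimal sketch in plain Python. It assumes the new classes are simply appended after the highest existing ID; the helper name `build_category_index` and the shortened COCO list are hypothetical, and a real label map would list every COCO entry.

```python
# Shortened, illustrative subset of the COCO label map.
# A real label map would contain every COCO class, not just these four.
coco_categories = [
    {"id": 1, "name": "person"},
    {"id": 2, "name": "bicycle"},
    {"id": 3, "name": "car"},
    # ... remaining COCO entries omitted in this sketch ...
    {"id": 90, "name": "toothbrush"},
]

# The two classes added by fine-tuning.
new_class_names = ["helmet", "head"]


def build_category_index(base_categories, extra_names):
    """Hypothetical helper: append new classes after the highest existing ID
    and return a dict keyed by ID, e.g. {91: {"id": 91, "name": "helmet"}}."""
    next_id = max(c["id"] for c in base_categories) + 1
    extra = [{"id": next_id + i, "name": name} for i, name in enumerate(extra_names)]
    return {c["id"]: c for c in base_categories + extra}


category_index = build_category_index(coco_categories, new_class_names)

# In this sketch num_classes is taken to be the highest label ID.
num_classes = max(category_index)

print(num_classes)                                             # 92
print(category_index[91]["name"], category_index[92]["name"])  # helmet head
```

The IDs 91 and 92 are only one possible choice; what matters is that the IDs in the label map, the training annotations and num_classes all agree.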
Following your suggestion:
1/ I’ve updated num_classes to 92 as requested.
2/ I’ve also created the category index dictionary, starting from the original label_map.pbtxt of the COCO dataset (90 classes) and adding my 2 classes at the end :
During fine-tuning, if the training data doesn’t contain examples of the previous 90 classes, the model’s ability to detect them may suffer: the original model was trained on 90 classes, and now the weights will be adjusted for just the two new classes.
My initial question was “How to keep original classes when fine-tuning a model”.
The idea behind it was:
Making a prediction with a 90-class model takes x ms.
Making a prediction with a 2-class model (helmet + head) takes y ms.
So making predictions on 92 classes by querying 2 models takes x + y ms.
My feeling was that the response time would be better if I managed to create a single model with my 92 classes.
Am I right, and how can I achieve it?
And yes, your intuition is correct. Querying a model with 92 classes is not going to be much slower than querying one with 90 classes, whereas querying two models is going to be slower. Also, combining the outputs of two models isn’t going to answer your question, because no matter what input you give, the softmax values from each model will sum to 1. So you could get 0.75 helmet and 0.25 head from one model, and 0.75 hair dryer with 0.25 distributed over the other 89 classes from the other, and you can’t tell whether the right answer is helmet or hair dryer.
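To make the normalization argument concrete, here is a small sketch in plain Python. It assumes each model ends in a softmax over its own classes, as described above; all logit values are invented for illustration and do not come from any real model.

```python
import math


def softmax(logits):
    """Numerically stable softmax: the outputs are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2-class model (helmet, head): logits chosen to give 0.75 / 0.25.
p_small = softmax([math.log(3.0), 0.0])

# Hypothetical 90-class model: one class (say "hair dryer") gets 0.75 and the
# remaining 0.25 is spread evenly over the other 89 classes.
coco_logits = [0.0] * 90
coco_logits[88] = math.log(3.0 * 89)
p_coco = softmax(coco_logits)

# Each model normalizes over its own classes only, so the two top scores
# are not on a shared scale and cannot be ranked against each other.
print(round(max(p_small), 6), round(max(p_coco), 6))

# A single 92-class model (hypothetical logits) yields one distribution in
# which all classes, old and new, compete directly.
joint_logits = [0.0] * 90 + [4.0, 1.0]
p_joint = softmax(joint_logits)
best = max(range(len(p_joint)), key=p_joint.__getitem__)
print(best)  # index 90, the first of the two new classes in this sketch
```

Both separate models report a top score of 0.75 here, which is exactly the ambiguity described: neither number says anything about how helmet compares with hair dryer.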