TensorFlow Model Garden tutorial

Hello everyone, I have two questions. Thank you for your interest.

1- Object detection with Model Garden | TensorFlow Core: Does this tutorial use transfer learning? If so, can you explain how you can tell?

2- I followed this tutorial with a different dataset, a pothole dataset.

The validation metrics are very bad. How can I improve them? I need an AP above 0.5.
I trained the model for 30,000 steps.

The model and its parameters are:
exp_config = exp_factory.get_exp_config('retinanet_resnetfpn_coco')

batch_size = 32
num_classes = 1
HEIGHT, WIDTH = 640, 640
IMG_SIZE = [HEIGHT, WIDTH, 3]

# Backbone config.
exp_config.task.freeze_backbone = True
exp_config.task.annotation_file = ''

# Model config.
exp_config.task.model.input_size = IMG_SIZE
exp_config.task.model.num_classes = num_classes + 1
exp_config.task.model.detection_generator.tflite_post_processing.max_classes_per_detection = exp_config.task.model.num_classes

# Training data config.
exp_config.task.train_data.input_path = train_data_input_path
exp_config.task.train_data.dtype = 'float32'
exp_config.task.train_data.global_batch_size = batch_size
exp_config.task.train_data.parser.aug_scale_max = 1.0
exp_config.task.train_data.parser.aug_scale_min = 1.0

# Validation data config.
exp_config.task.validation_data.input_path = valid_data_input_path
exp_config.task.validation_data.dtype = 'float32'
exp_config.task.validation_data.global_batch_size = batch_size

The resulting metrics are:

eval | step: 30000 | steps/sec: 5.4 | eval time: 18.4 sec | output:
{'AP': 0.10134392,
'AP50': 0.29438367,
'AP75': 0.03953968,
'APl': 0.550495,
'APm': 0.26847038,
'APs': 0.050420657,
'ARl': 0.55,
'ARm': 0.36355934,
'ARmax1': 0.107679464,
'ARmax10': 0.18247078,
'ARmax100': 0.21385643,
'ARs': 0.17557411,
'box_loss': 0.009063716,
'cls_loss': 0.60130537,
'model_loss': 1.054491,
'steps_per_second': 5.423086661586954,
'total_loss': 5.3274546,
'validation_loss': 5.3274546}

Hi @Taha_Er ,

No, it does not contain transfer learning.

In the tutorial, the checkpoint initialization is missing. If you need transfer learning, all you need to do is initialize the model from a checkpoint and then train it.

You can download the checkpoints from here:

exp_config = tfm.core.exp_factory.get_exp_config('retinanet_resnetfpn_coco')
!wget "https://storage.googleapis.com/tf_model_garden/vision/resnet/resnet-50-i224.tar.gz"
!tar -xzvf resnet-50-i224.tar.gz

# Backbone config.
exp_config.task.init_checkpoint = "downloaded checkpoint path"
exp_config.task.freeze_backbone = True
exp_config.task.annotation_file = ''

I hope this helps you.

Thanks.


@Laxma_Reddy_Patlolla That gave me some progress. Now I'm facing the following error:

Unable to open table file /content/TransferLearning/model.ckpt-62400.data-00000-of-00001: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

I am also sharing the Colab notebook; it may help to figure out the problem.

An update:

The problem was caused by the path.
The path should look like this: "./content/ckpt-33264.data-00000-of-00001"
but I forgot the "." at the start of the path.

Now I have a new error:

restoring or initializing model…

RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/py_checkpoint_reader.py in NewCheckpointReader(filepattern)
91 try:
---> 92 return CheckpointReader(compat.as_bytes(filepattern))
93 # TODO(b/143319754): Remove the RuntimeError casting logic once we resolve the

RuntimeError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./content/ckpt-33264.data-00000-of-00001

During handling of the above exception, another exception occurred:

NotFoundError Traceback (most recent call last)
12 frames
/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/py_checkpoint_reader.py in error_translator(e)
29 'Failed to find any '
30 'matching files for') in error_message:
---> 31 raise errors_impl.NotFoundError(None, None, error_message)
32 elif 'Sliced checkpoints are not supported' in error_message or (
33 'Data type '

NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./content/ckpt-33264.data-00000-of-00001

Hi @Taha_Er,

For the checkpoint, ./content/ckpt-33264 is enough; you don't have to include the .data suffix.
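
To illustrate why the prefix alone is enough: TensorFlow writes a checkpoint as an `.index` file plus one or more `.data-XXXXX-of-XXXXX` shard files that share a common prefix, and restoring expects that prefix rather than a single shard. The snippet below is a minimal sketch in plain Python, not part of Model Garden; the helper name `checkpoint_prefix` is hypothetical, and the regular expression assumes the usual five-digit shard suffix.

```python
import re

# Matches the shard suffix that follows a checkpoint prefix,
# e.g. ".data-00000-of-00001" (assumed five-digit format).
_SHARD_SUFFIX = re.compile(r"\.data-\d{5}-of-\d{5}$")


def checkpoint_prefix(path: str) -> str:
    """Return the checkpoint prefix for a shard, index, or prefix path.

    Hypothetical helper for illustration only; it just manipulates strings
    and does not check that the files exist.
    """
    if path.endswith(".index"):
        return path[: -len(".index")]
    return _SHARD_SUFFIX.sub("", path)


# A shard path, an index path, and a bare prefix all map to the same prefix.
print(checkpoint_prefix("./content/ckpt-33264.data-00000-of-00001"))
print(checkpoint_prefix("./content/ckpt-33264.index"))
print(checkpoint_prefix("./content/ckpt-33264"))
```

The returned prefix is the value to assign to `exp_config.task.init_checkpoint`.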

Also, please find the gist Colab trained on the pothole dataset. Please use the Roboflow dataset from your own account, as I did it from mine.

Hope this helps!

Thanks


Thank you very much for your answer; your solution fixed everything.
