Custom Object Detection Model with TensorFlow

I’m training a custom model for my project using this Colab, but I get errors when I try to create the record files. I tried deleting the files named in the errors, but after I deleted them the error continued with other files. I thought the error was related to a maximum file count or to overly long file names, but I can’t solve it.

My Colab Code:

!python3 create_csv.py
!python3 create_tfrecord.py --csv_input=images/train_labels.csv --labelmap=labelmap.txt --image_dir=images/train --output_path=train.tfrecord
!python3 create_tfrecord.py --csv_input=images/validation_labels.csv --labelmap=labelmap.txt --image_dir=images/validation --output_path=val.tfrecord

create_tfrecord.py:

# Script to create TFRecord files from train and test dataset folders
# Originally from GitHub user datitran: https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py

"""
Usage:
  # From tensorflow/models/
  # Create train data:
  python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train --output_path=train.record

  # Create test data:
  python generate_tfrecord.py --csv_input=images/test_labels.csv  --image_dir=images/test --output_path=test.record
"""
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd

from tensorflow.python.framework.versions import VERSION
if VERSION >= "2.0.0a0":
    import tensorflow.compat.v1 as tf
else:
    import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('labelmap', '', 'Path to the labelmap file')
flags.DEFINE_string('image_dir', '', 'Path to the image directory')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS

def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    labels = []
    with open(FLAGS.labelmap, 'r') as f:
        labels = [line.strip() for line in f.readlines()]

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(int(labels.index(row['class'])+1))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    # Load and prepare data
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)

    # Create TFRecord files
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))

    # Create labelmap.pbtxt file
    path_to_labeltxt = os.path.join(os.getcwd(), FLAGS.labelmap)
    with open(path_to_labeltxt, 'r') as f:
        labels = [line.strip() for line in f.readlines()]
    
    path_to_labelpbtxt = os.path.join(os.getcwd(), 'labelmap.pbtxt')
    with open(path_to_labelpbtxt,'w') as f:
        for i, label in enumerate(labels):
            f.write('item {\n' +
                    '  id: %d\n' % (i + 1) +
                    '  name: \'%s\'\n' % label +
                    '}\n' +
                    '\n')

if __name__ == '__main__':
    tf.app.run()

Error:

Successfully converted xml to csv.
Successfully converted xml to csv.
Traceback (most recent call last):
  File "/content/create_tfrecord.py", line 120, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/content/create_tfrecord.py", line 98, in main
    tf_example = create_tf_example(group, path)
  File "/content/create_tfrecord.py", line 46, in create_tf_example
    encoded_jpg = fid.read()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 114, in read
    self._preread_check()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 76, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.NotFoundError: /content/images/train/cone_de_signalisation_jpg.rf.64d539df00fc20d8889f93ec8b9c3763.jpg; No such file or directory
Traceback (most recent call last):
  File "/content/create_tfrecord.py", line 120, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/content/create_tfrecord.py", line 98, in main
    tf_example = create_tf_example(group, path)
  File "/content/create_tfrecord.py", line 46, in create_tf_example
    encoded_jpg = fid.read()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 114, in read
    self._preread_check()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 76, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.NotFoundError: /content/images/validation/cone_de_signalisation_jpg.rf.7ccceaae115376af761d854d34e24397.jpg; No such file or directory

Hi @floodinator ,

If your data is in XML format, you can convert it to COCO format using a tool such as Roboflow. Once your data is in COCO format, you can use the following script to convert it to TFRecords.

I think this will be easier than debugging the error above.

I hope this helps!

Thanks

Hi, I solved the issue by changing the file names. Thanks for your answer.
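
For anyone hitting the same NotFoundError, here is a minimal sketch for listing every file name that the labels CSV references but that is not present in the image folder, so you can see all of the mismatches at once instead of one per run. It assumes the CSV has a `filename` column (the column create_tfrecord.py groups by); the function name `find_missing_images` is made up for this example.

```python
import csv
import os


def find_missing_images(csv_path, image_dir):
    """Return the file names listed in the CSV that are absent from image_dir."""
    with open(csv_path, newline="") as f:
        # A file with several boxes appears on several rows, so use a set.
        listed = {row["filename"] for row in csv.DictReader(f)}
    on_disk = set(os.listdir(image_dir))
    return sorted(listed - on_disk)
```

Calling `find_missing_images("images/train_labels.csv", "images/train")` before running create_tfrecord.py should return an empty list if every annotated image is actually in the folder.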

I am having a similar issue while training a model with the same Colab notebook you used, but I am getting an error in create_csv.py.
The error is as follows:

Traceback (most recent call last):
  File "/content/create_csv.py", line 36, in <module>
    main()
  File "/content/create_csv.py", line 32, in main
    xml_df = xml_to_csv(image_path)
  File "/content/create_csv.py", line 19, in xml_to_csv
    int(member[4][0].text),
IndexError: child index out of range

Can you tell me how to fix this problem?

Hi,

I think there is a problem reading your XML files. The error means the script could not find the element it expects at member[4][0], that is, the first child of the object’s fifth element, so check that your XML files are complete.
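
A rough sketch of how that check could be automated, in case it helps. It assumes the annotations are Pascal VOC style XML files with `<object>` elements, and that create_csv.py loops over those objects (the script itself is not shown in this thread, only the `member[4][0]` line from the traceback). The function name `find_bad_annotations` is made up for this example.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_bad_annotations(xml_dir):
    """Return (path, reason) pairs for XML files where member[4][0] would fail."""
    problems = []
    for xml_path in sorted(glob.glob(os.path.join(xml_dir, "*.xml"))):
        try:
            root = ET.parse(xml_path).getroot()
        except ET.ParseError as err:
            problems.append((xml_path, "not valid XML: {}".format(err)))
            continue
        for obj in root.findall("object"):
            children = list(obj)
            # The traceback indexes member[4][0], so the fifth child must
            # exist and must itself contain at least one child element.
            if len(children) < 5 or len(children[4]) == 0:
                problems.append((xml_path, "object has no member[4][0]"))
            elif children[4].tag != "bndbox":
                reason = "fifth child is <{}>, not <bndbox>".format(children[4].tag)
                problems.append((xml_path, reason))
    return problems
```

Running `find_bad_annotations("images/train")` (folder name assumed) lists the files to inspect or re-export; an empty list means every object has the layout that the indexing expects.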

Hello, I solved this issue by converting my images directly into TFRecords on Roboflow. I think the problem occurred because of my class names: I had named the class “oil spot” when it should have been “oil_spot”, which caused an error when reading the XML files.

Thanks.

What do you mean by this, sir? How and where do I change the filenames?

I changed the names of the train, test and validation image files. Also, if you’re using Roboflow, you can change the download format (YOLOv5, YOLOv8, COCO, JSON, etc.), which is simpler than changing the file names.
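
If you prefer to rename in bulk, here is a rough sketch rather than a tested recipe. It assumes each image sits next to an XML file with the same base name and that the XML has a `<filename>` element; it does not touch any other element (for example `<path>`). The function name `rename_pairs` and the `img` prefix are made up for this example. Try it on a copy of the folder first, and run it before create_csv.py so the CSV picks up the new names.

```python
import os
import xml.etree.ElementTree as ET

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def rename_pairs(folder, prefix="img"):
    """Rename image/XML pairs to short names and update <filename> in the XML."""
    renamed = {}
    images = [n for n in sorted(os.listdir(folder))
              if n.lower().endswith(IMAGE_EXTENSIONS)]
    for i, old_image in enumerate(images, start=1):
        stem, ext = os.path.splitext(old_image)
        old_xml = os.path.join(folder, stem + ".xml")
        if not os.path.exists(old_xml):
            continue  # leave images without an annotation untouched
        new_stem = "{}_{:05d}".format(prefix, i)
        if new_stem == stem:
            continue  # already has the target name
        new_image = new_stem + ext.lower()
        new_image_path = os.path.join(folder, new_image)
        new_xml = os.path.join(folder, new_stem + ".xml")
        if os.path.exists(new_image_path) or os.path.exists(new_xml):
            raise FileExistsError("target name already in use: " + new_stem)
        tree = ET.parse(old_xml)
        node = tree.getroot().find("filename")
        if node is not None:
            node.text = new_image  # keep the annotation pointing at the image
        tree.write(new_xml)
        os.remove(old_xml)
        os.rename(os.path.join(folder, old_image), new_image_path)
        renamed[old_image] = new_image
    return renamed
```

The returned dictionary maps old image names to new ones, so you can keep a record of what was changed.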