TFLite inference on a single image

I have gone through the image classification example with TFLite Model Maker (Image classification with TensorFlow Lite Model Maker) and I have generated the custom model.

So now I want to run inference on this model, and when I look at the docs (Image classification with TensorFlow Lite Model Maker), it says the image should be 224,224,3 and in the range [0, 1].

The 224,224,3 part is fine, but I am puzzled by the [0, 1] normalisation. Currently, I do this:

import tensorflow as tf

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model"""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)          # uint8, values in [0, 255]
    img = tf.image.convert_image_dtype(img, tf.uint8)  # no-op here: already uint8
    print('min max img value', tf.reduce_min(img), tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)     # note: returns float32
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

as the preprocessing step, but then my outputs are uint8 values in the [0, 255] range. However, I was hoping to see these confidence scores in the [0, 1] range.

Am I missing something here?

Hi @John_J_Watson ,

I am a little confused. You write about pre-processing, but at the end you mention confidence scores.
You can normalize the values by dividing the image pixel RGB values by 255.

Best

@George_Soloupis Thank you for your comment. I apologize - my question was not very clear. What I did was follow the tutorial for image classification and produce a tflite model from it. When I run inference on a single image (with preprocessing copied from the tflite webpages), I get a result array (of dim = number_of_classes) with values in the range [0, 256) (the sum of the array is always 256). I am somewhat perplexed as to why this is and I am trying to understand it a bit more.

To this end, my first thought was that maybe I need to somehow normalise the input image (I know, now, this does not make sense).

When I try dividing the result array contents by 256, it kind of makes sense as a probability score - but I am unsure, because there is a small variation between these scores and the ones output by predict_top_k.

I hope my question is a bit clearer now.

@John_J_Watson If you used a preprocessing step for training, then you have to use the same one for inference, even for a single image. But first check which model was used as the backbone for training. By checking this you will be able to retrieve the preprocessing steps.

@George_Soloupis thank you again for taking the time to reply to this.

So, I have literally just followed the official tflite classification tutorial to generate the model.
The tutorial simply loads data from folders, from what I understand. The default model is EfficientNet-Lite0. It says the following on the page:

Preprocess the raw input data. Currently, preprocessing steps including normalizing the value of each image pixel to model input scale and resizing it to model input size. EfficientNet-Lite0 have the input scale [0, 1] and the input image size [224, 224, 3].

Now, I am unsure how to scale the input into the [0, 1] range mentioned here ← hence my original question, since my input seems to be in the [0, 255] range.
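(One way I could imagine resolving this - a minimal sketch, not something from the tutorial - is to inspect the exported model's input tensor details and let that decide whether to normalise. The helper below only looks at the `dtype` field of one entry from `interpreter.get_input_details()`; the `model.tflite` path in the commented usage is an assumption.)

```python
import numpy as np

def expects_float_input(input_detail):
    """Given one entry from interpreter.get_input_details(), return True if the
    model wants float input (i.e. you must scale to [0, 1] yourself) and False
    if it wants raw uint8 pixels (quantized model with normalization baked in)."""
    return np.issubdtype(np.dtype(input_detail["dtype"]), np.floating)

# Hypothetical usage against the exported model:
# import tensorflow as tf
# interpreter = tf.lite.Interpreter(model_path="model.tflite")
# detail = interpreter.get_input_details()[0]
# print(detail["dtype"], detail["quantization"])  # e.g. uint8, (scale, zero_point)
# print(expects_float_input(detail))

print(expects_float_input({"dtype": np.uint8}))
```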

So, I either do this:

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model"""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.uint8)
    print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

or I do:

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model"""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    img = tf.cast(img, dtype=tf.float32) / tf.constant(255, dtype=tf.float32) 
    print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

the difference is:

img = tf.cast(img, dtype=tf.float32) / tf.constant(255, dtype=tf.float32)

to normalise the values into [0, 1]. But then the class values are totally wrong - it makes the wrong predictions.

However, if I do NOT normalise the values to [0, 1], the results look perfect (although the results are in the range [0, 255]), and taking the argmax gives me the correct class. So I think I can conclude from these little experiments that the normalisation line is not required. This, however, seems to go against the docs.

So, I am puzzled about what is going on, and I am looking for an official inference example for TFLite Model Maker classification :frowning:

I am now wondering if the inference interpreter automatically does some normalisation I am unaware of, and whether I therefore do not need to do any of these normalisations myself.

When I look at some random posts such as:

I see that the inference output is in the [0, 255] range for a single image, which is the case for me.

The other thought I have is that, since I quantise the model via: model.export(export_dir='.') I think the output range is between 0-255. I got this idea from here: examples/EXPLORE_THE_CODE.md at master · tensorflow/examples · GitHub

So, yeah, just a bit confused as to what the correct thing to do here is.

Thank you.

Seems like normalization is already in the model, so you should just resize the image.

Check here to see the documentation of the Model Maker Image Classification

It seems that it is using (image-mean) / std.
You can follow the instructions here to install and use the specific class.

I am tagging also @Yuqi_Li as they are mentioned in the code.

Best

@George_Soloupis so it looks like I do not need to do any normalisation, right (in inference mode, when loading the quantised tflite model)? As @Kzyh points out, just resize and feed it in?

The natural question now is:

  • should I pass the image through this class, or just resize and send the image to the tflite model (the one generated by model.export(export_dir='.'))?

I think so. You can save the tflite model file, then open it in Netron and check if there is any preprocessing.

@Kzyh thanks for the Netron tip. So when I load the model I see the first few ops as:

- input 1_0
- Quantize
- Mul B=127
- Add B=-128
- Conv...
...

The Mul and Add seem like some form of normalization? Is that right?

Yeah, seems it is normalization.
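(For what it's worth, a Mul followed by an Add is just an affine transform y = x·m + b, so the pair shown in Netron is consistent with the graph folding the input rescaling into itself - this is my reading of the constants shown, not something verified against the model internals:)

```python
def affine(x, mul_b, add_b):
    """What a Mul/Add pair in the graph computes: y = x * mul_b + add_b."""
    return x * mul_b + add_b

# With the constants seen in Netron (Mul B=127, Add B=-128), an input
# of 0.0 lands exactly at the bottom of the signed 8-bit range that
# quantized convolutions operate on internally.
print(affine(0.0, 127.0, -128.0))  # -128.0
```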

@Kzyh it seems a bit bizarre. Let me explain. When I train the model and run predict_top_k on a single image from the test set, like so:

    print(model.evaluate_tflite('./src_classify/model.tflite', test_data))
    predicts = model.predict_top_k(test_data)

I get (it is a 2 class problem):

[[('class0', 0.91340613)]]

So, now I export the model and run the same image and I get:

[241  15]
>>> 241/255
0.9450980392156862

so, the scores don't match up… really bizarre.
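(A sketch of one likely source of the mismatch, not verified against your model: a quantized output tensor should be dequantized with its own scale and zero_point from `get_output_details()`, rather than a blanket division by 255 or 256. Something like:)

```python
import numpy as np

def dequantize(q, scale, zero_point):
    """Map raw quantized scores back to floats: real = scale * (q - zero_point)."""
    return scale * (np.asarray(q, dtype=np.float32) - zero_point)

# Hypothetical usage (the values would come from the real interpreter):
# import tensorflow as tf
# interpreter = tf.lite.Interpreter(model_path="model.tflite")
# out = interpreter.get_output_details()[0]
# scale, zero_point = out["quantization"]
# probs = dequantize(interpreter.get_tensor(out["index"]), scale, zero_point)

# With an assumed scale of 1/256 and zero_point of 0, the raw score 241
# becomes 241/256 ≈ 0.941 - close to, but not the same as, 241/255.
print(dequantize([241, 15], 1.0 / 256.0, 0))
```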

My input prep is:

def preprocess_image(image_path, input_size):
    """Preprocess the input image to feed to the TFLite model"""
    img = tf.io.read_file(image_path)
    img = tf.io.decode_image(img, channels=3)
    img = tf.cast(img, tf.float32) * (1. / 255)  # keep on; divide result by 256
    print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
    original_image = img
    resized_img = tf.image.resize(img, input_size)
    resized_img = resized_img[tf.newaxis, :]
    return resized_img, original_image

What could be the problem? :cry:

Not sure, but can you try moving this after resizing the image?

* (1. / 255)
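(To make the ordering concrete - a sketch in plain NumPy, with a crude nearest-neighbour resize purely for illustration; the thread itself uses tf.image.resize, and `to_model_input` is my own name:)

```python
import numpy as np

def to_model_input(img_uint8, size=(224, 224)):
    """Resize first, normalize after - the order suggested above."""
    h, w = img_uint8.shape[:2]
    # crude nearest-neighbour resize via index sampling (illustration only)
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = img_uint8[rows][:, cols]
    # scale to [0, 1] only AFTER resizing, then add the batch dimension
    return (resized.astype(np.float32) / 255.0)[np.newaxis, ...]

x = to_model_input(np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8))
print(x.shape, x.min() >= 0.0, x.max() <= 1.0)
```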