Trouble Translating the Output of an ImageSegmenter

I’m attempting to use the TensorFlow Lite Task Library on Android.
I have tried several of the Task Library segmentation models available on TensorFlow Hub.

The output of these models all seems to be a confidence mask: a set of greyscale bitmaps, one per class, where each pixel value is supposed to be the probability that the pixel belongs to that class.
However, the documentation indicates that the values should be a FLOAT32 from 1 to 2.
I don’t understand how this is a probability. I would think a probability would be from 0 to 1.

Additionally, I’m not sure that I am reading the values correctly, as they seem to fall both well above and well below the 1-to-2 range.
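
For what it’s worth, this is how I assumed a confidence mask would be interpreted (a minimal sketch of my own, not from any docs, assuming the raw values are unnormalized per-class scores that a softmax would turn into 0-to-1 probabilities):

import numpy as np

# Hypothetical raw output: one score per pixel per class.
# Shape [1, height, width, num_classes]; here a tiny 2 x 2 example with 3 classes.
raw_scores = np.array([[[[2.1, -0.3, 0.5],
                         [0.0,  1.7, -1.2]],
                        [[-0.8, 0.4, 3.0],
                         [1.1,  1.0, 0.9]]]], dtype=np.float32)

# If these are unnormalized scores (logits), a softmax over the class axis
# turns them into probabilities in [0, 1] that sum to 1 per pixel.
exp = np.exp(raw_scores - raw_scores.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)

print(probs.min(), probs.max())  # everything lands in [0, 1]
print(probs.sum(axis=-1))        # each pixel's probabilities sum to 1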

I have looked at the Android example for the ImageSegmenter, but that example uses a segmentation mask, which appears to be much easier to translate, rather than a confidence mask.

Hi,

Can you please share the link to the documentation you used?
Was it this one: https://www.tensorflow.org/lite/inference_with_metadata/task_library/image_segmenter

I couldn’t find this: “the documentation indicates that the values should be a FLOAT32 from 1 to 2.”

Hello,

There was a mixture of clues that I was putting together. The TensorFlow Hub page for the model indicates, through the output metadata section, that the range of the output values is from 1 to 2:
https://tfhub.dev/sayakpaul/lite-model/deeplabv3-xception65-ade20k/1/default/2

When I load the model into an interpreter and check the output tensor, it indicates that the data type is FLOAT32 and the shape is 1 x 129 x 129 x 151.
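
For reference, this is roughly that check, sketched here with the Python tf.lite.Interpreter rather than the Android one (the model filename is just a placeholder for the file downloaded from TensorFlow Hub):

import tensorflow as tf

# Placeholder path for the model downloaded from TensorFlow Hub.
interpreter = tf.lite.Interpreter(model_path="deeplabv3-xception65-ade20k.tflite")
interpreter.allocate_tensors()

output_details = interpreter.get_output_details()[0]
print(output_details["dtype"])  # numpy.float32, i.e. FLOAT32
print(output_details["shape"])  # [  1 129 129 151]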

This gave me the indication that it is a Confidence Mask.
However, I have discovered that this is not true. The output is actually a Segmentation Mask.

I am still unsure why the metadata indicates the range of values is from 1 to 2 though.

This is easy to find out.
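
For example, the packed metadata, including the stated output range, can be dumped as JSON with the tflite-support package (a minimal sketch, assuming the package is installed and the model file has been downloaded locally):

from tflite_support import metadata

# Placeholder path for the model downloaded from TensorFlow Hub.
displayer = metadata.MetadataDisplayer.with_model_file(
    "deeplabv3-xception65-ade20k.tflite")

# Prints the full metadata JSON; the output tensor's stats entry is where
# the min/max range (the "1 to 2" in question) is recorded.
print(displayer.get_metadata_json())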

Hi @Sayak_Paul, can you help us here?

Probably the min and max ranges in the output_tensor_metadata field are flawed. The metadata population was done in bulk with sample code, and that is probably where the error stemmed from.

If you take a look at the Colab Notebook associated with the model, you will find how to post-process the model outputs in Python:
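
In outline, the post-processing there boils down to something like the sketch below (the variable names, sizes, and the dummy input are illustrative, not copied verbatim from the notebook):

import numpy as np
import tensorflow as tf

# Dummy stand-in for the raw model output: [1, 129, 129, 151] float scores,
# one channel per ADE20K class.
raw_prediction = np.random.rand(1, 129, 129, 151).astype(np.float32)
height, width = 512, 512  # size of the original input image (illustrative)

# Resize the per-class scores back to the image size, then take the most
# likely class per pixel; this collapses the confidence mask into a
# single-channel segmentation map.
seg_map = tf.argmax(tf.image.resize(raw_prediction, (height, width)), axis=3)
seg_map = tf.squeeze(seg_map).numpy().astype(np.uint8)  # [512, 512] class ids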

Thank you for the reply @Sayak_Paul!

I did previously look at the Colab Notebook. Thank you for providing these!

However, in the Colab I was thrown off by the post-processing, as the notebook appears to treat the output as a confidence mask across each class:
seg_map = tf.argmax(tf.image.resize(raw_prediction, (height, width)), axis=3)

On Android using the Task Library, however, this model appears to return a segmentation mask and no argmax is required.

I was attempting to post-process like the Python notebook, but the results came out skewed, which is what prompted me to come here to the forums.

Is this just my ignorance, or does the Colab post-processing treat the output as a confidence mask?

If you take a look at the Convert to TFLite step of the Colab Notebook, you will notice this comment:

# The preprocessing and the post-processing steps should not be included in the TF Lite model graph 
# because some operations (ArgMax) might not support the delegates. 

This is why we need to treat the results as a confidence mask.

@Sayak_Paul Thank you! I didn’t pay attention to that part🤦‍♂️
