What is the best way to export a TF model to TFLite so it works with different input sizes?

I am exporting my TensorFlow model to TFLite. I want to be able to run the TFLite model on a mobile device's GPU with different input shapes.

Should I:
a) export the model with a dynamic input size (1, None, None, 3),
or
b) export the model with some fixed, valid input size (e.g. (1, 256, 256, 3))?
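For context, here is a minimal export sketch covering both variants. The toy Conv2D model, the file names and the converter options are placeholders for my real setup, not the actual model:

```python
import tensorflow as tf

# Toy stand-in for the real model: a single conv layer over NHWC float32 input.
inputs = tf.keras.Input(shape=(None, None, 3))
outputs = tf.keras.layers.Conv2D(8, 3, padding="same")(inputs)
model = tf.keras.Model(inputs, outputs)

def export_tflite(model, input_shape, path):
    # Wrap the Keras model in a tf.function with an explicit input signature;
    # a None dimension marks that axis as dynamic in the converted model.
    run = tf.function(lambda x: model(x))
    concrete_fn = run.get_concrete_function(
        tf.TensorSpec(input_shape, tf.float32))
    converter = tf.lite.TFLiteConverter.from_concrete_functions(
        [concrete_fn], model)
    with open(path, "wb") as f:
        f.write(converter.convert())

# a) dynamic spatial dimensions
export_tflite(model, (1, None, None, 3), "model_dynamic.tflite")
# b) some fixed, valid input size
export_tflite(model, (1, 256, 256, 3), "model_fixed.tflite")
```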

In both cases I can then call interpreter.resize_tensor_input before inference, and it works well locally (in Python, on CPU).
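Locally the flow looks roughly like this (a sketch; the model file name, the 512x512 shape and the zero-filled input are placeholders):

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_dynamic.tflite")

# Resize input 0 to the shape I actually want to run, then allocate tensors.
interpreter.resize_tensor_input(0, [1, 512, 512, 3], strict=False)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

image = np.zeros((1, 512, 512, 3), dtype=np.float32)  # dummy input
interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
```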

However, when I use the TFLite benchmark model tool with --use_gpu=true --input_layer=input_1 --input_layer_shape=1,512,512,3, the model created with dynamic input shapes (sometimes) fails.
For example, when the model includes a tf.concat operation, I get the following error:

INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite delegate for GPU.
INFO: GPU delegate created.
ERROR: tensorflow/lite/kernels/concatenation.cc:202 t->dims->data[d] != t0->dims->data[d] (0 != 1)
ERROR: Node number 348 (CONCATENATION) failed to prepare.
ERROR: Failed to apply GPU delegate.

Without the --use_gpu flag the benchmark executes successfully. I investigated this, and from my understanding the problem occurs during the interpreter_->ModifyGraphWithDelegate(delegate); call, which invokes Subgraph::EnsureMemoryAllocations(), where AllocateTensors() is called.
If (in benchmark_tflite_model.cc) I first invoke interpreter_->ResizeInputTensor with a valid shape, the model created with dynamic input shapes runs without errors.

However, I am not sure whether my modification is valid. Maybe option b) is perfectly fine? Could someone with a better understanding clarify?

Does anyone have any insight into this? Is exporting the model with any valid fixed input size fine? I still couldn't find any best practices about it :frowning:

The issue you're encountering seems to be related to using the TensorFlow Lite GPU delegate with a model that has dynamic input shapes. The GPU delegate might not handle dynamic shapes seamlessly in all cases, which leads to errors during the graph modification and tensor allocation process.

In this scenario, there are a few considerations:

1. Exporting the model with a dynamic input size (1, None, None, 3). This allows flexible input sizes during inference: with the TensorFlow Lite interpreter you can resize the input tensor dynamically to whatever size you want for a specific inference. However, as you observed, the TensorFlow Lite GPU delegate may run into problems with tensor allocation and GPU operations, especially when the model involves operations like tf.concat.

2. Exporting the model with a fixed input size (1, 256, 256, 3). This gives the model a fixed input size, and you can still use the TensorFlow Lite interpreter's resize_tensor_input function to adapt the input size during inference. This approach might be more compatible with the GPU delegate, as it avoids handling dynamic shapes during GPU operations.

Regarding your modification to benchmark_tflite_model.cc, where you first invoke interpreter_->ResizeInputTensor with a valid shape before running the model: this seems like a workaround to ensure that the GPU delegate handles the dynamic input shapes correctly, @Jakub_Gorski.

@BadarJaffer thank you for the reply! I was confused, because here:
TFLITE Relocate Tensor Fail · Issue #41807 · tensorflow/tensorflow (github.com)

I understood that exporting the model with a dummy input size and resizing it later is not a valid approach. But for me it seemed to work correctly.
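(Side note: one way I know of to check whether a converted model really kept its dynamic dimensions is to look at the shape_signature of its inputs. A quick sketch, with placeholder model paths:)

```python
import tensorflow as tf

for path in ("model_dynamic.tflite", "model_fixed.tflite"):
    interpreter = tf.lite.Interpreter(model_path=path)
    details = interpreter.get_input_details()[0]
    # 'shape_signature' reports -1 for dimensions that are dynamic in the model;
    # 'shape' is the concrete shape used unless the input is resized.
    print(path, details["shape"], details["shape_signature"])
```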

About the GPU delegate being the cause, I'm not sure that is the case. When I use the TFLite benchmark model tool with --use_gpu=false --use_xnnpack=true --input_layer=input_1 --input_layer_shape=1,512,512,3, the model created with dynamic input shapes also fails. It only executes correctly with the default delegate. What is even stranger, the default delegate is actually also XNNPACK.

The difference lies in the order of execution in the TFLite benchmark tool. When the default delegate is used, the tensors are allocated first. When a delegate is specified explicitly, the interpreter_->ModifyGraphWithDelegate(delegate) call is executed first.

So I see 3 options:
a) There is a bug in the TFLite benchmark tool (I guess that is unlikely);
b) Exporting the model with a fixed input size (1, 256, 256, 3) and using the resize_tensor_input function is actually the right approach for my model;
c) My model cannot be exported with dynamic input sizes at all: the valid way to do it would be to export it with a (1, None, None, 3) shape, but the concat operation makes that impossible in my case.

So from what @BadarJaffer says, I suppose the correct answer is b) - but if anyone has a good understanding of this problem, please confirm or correct me :slight_smile:

Hi @Jakub_Gorski, let me know if it works, and if not, we can dig deeper into your problem. I am just a reply away. Good luck!