Specify precision in TFLite models

When converting floating-point TF models to TFLite models, how can I specify the precision of each operation? I understand how to specify the integer precision of weights and activations, but how can I find out (and set) the precision at which each operation is computed? For example, if I have an element-wise addition, how do I know whether the addition is computed in 8, 16, or 32 bits?

As a practical example, say I need to compute z = x + y for a residual connection, where x and y are both 8-bit tensors coming from previous conv layers. How can I compute x + y in 16 bits? It seems to me that TFLite doesn't offer this flexibility.

Hi @Wenjie_Lu ,

Maybe the details below will help you:

  • To compute z = x + y in 16 bits while x and y are 8-bit tensors, there are a few options (a QAT sketch follows this list):
    • QAT: Inject quantization nodes for the addition during training, guiding the model to learn a 16-bit representation for that part of the graph.
    • Custom operation: Implement a 16-bit addition kernel as a custom op and register it with the interpreter.
    • CPU delegation: Let the addition fall back to the floating-point CPU path instead of quantizing it.
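For the QAT route, a minimal sketch using the tensorflow_model_optimization (tfmot) Keras API might look like the following. The `Int16AddQuantizeConfig` class and the toy model are hypothetical illustrations, not a drop-in recipe; whether the converter ultimately emits a true 16-bit ADD kernel depends on what your runtime supports (TFLite's experimental 16x8 mode, `tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8`, does support 16-bit activations for ADD).

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize = tfmot.quantization.keras


class Int16AddQuantizeConfig(quantize.QuantizeConfig):
    # Hypothetical config: fake-quantize the Add layer's output with 16 bits.
    # Add has no weights, so the weight/activation hooks are no-ops.

    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        # 16-bit symmetric quantizer for the addition result.
        return [quantize.quantizers.MovingAverageQuantizer(
            num_bits=16, per_axis=False, symmetric=True,
            narrow_range=False)]

    def get_config(self):
        return {}


# Toy residual block: two conv branches joined by an annotated Add.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, padding="same")(inputs)
y = tf.keras.layers.Conv2D(8, 3, padding="same")(inputs)
z = quantize.quantize_annotate_layer(
    tf.keras.layers.Add(),
    quantize_config=Int16AddQuantizeConfig())([x, y])
model = tf.keras.Model(inputs, z)

# Annotate the remaining layers with their default 8-bit configs,
# then build the quantization-aware model.
annotated = quantize.quantize_annotate_model(model)
with quantize.quantize_scope(
        {"Int16AddQuantizeConfig": Int16AddQuantizeConfig}):
    qat_model = quantize.quantize_apply(annotated)
qat_model.summary()
```

After training, you would pass `qat_model` to `tf.lite.TFLiteConverter.from_keras_model` as usual, and you can verify the precision each op actually got by inspecting the converted model, for example with the Netron viewer or the interpreter's `get_tensor_details()`.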

I hope this helps.

Thanks.