QAT training: convert input/output to 8 bits instead of float32


I applied QAT-based training following the TensorFlow guidelines (Quantization aware training comprehensive guide | TensorFlow Model Optimization).

After that, I converted the quantized model to TFLite format following the TF guideline (see below).

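import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Load the QAT model inside quantize_scope so the quantization wrappers deserialize.
with tfmot.quantization.keras.quantize_scope():
  quant_aware_model = tf.keras.models.load_model(keras_qat_model_file)

converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()
with open(self.output_model_path, 'wb') as f:
  f.write(quantized_tflite_model)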

After conversion to TFLite format, the output model is, as expected, 4x smaller than the original model, because the parameters are stored in 8 bits instead of 32-bit floats.

However, the inputs and outputs are still float32. What is the recommended way to make the input and output 8-bit instead of float32?
Do I need to follow the same procedure as the one described for post-training quantization?
(Post-training integer quantization | TensorFlow Lite)

In the TFLite converter, the default input and output type is float32. If you want int8 input and output, please follow Post-training integer quantization. Thank you
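Once the model takes int8 input and returns int8 output, the caller has to map between float values and int8 itself. A minimal sketch of that mapping, assuming the usual affine scheme real = scale * (q - zero_point); the function names are made up for illustration, and in practice the scale and zero point would be read from the interpreter (for example the 'quantization' entry of get_input_details()) rather than hard-coded:

```python
def quantize_to_int8(values, scale, zero_point):
    """Map float values to int8 with q = round(x / scale) + zero_point."""
    quantized = []
    for x in values:
        q = round(x / scale) + zero_point
        # Clamp to the representable int8 range.
        quantized.append(max(-128, min(127, q)))
    return quantized


def dequantize_from_int8(values, scale, zero_point):
    """Map int8 values back to floats with x = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in values]


# Illustrative parameters only; real values come from the converted model.
example_scale = 0.5
example_zero_point = -128
int8_input = quantize_to_int8([0.0, 0.5, 1.0], example_scale, example_zero_point)
float_again = dequantize_from_int8(int8_input, example_scale, example_zero_point)
```

Values that fall outside the range covered by the scale and zero point are clamped, so they do not round-trip exactly.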

Thanks for your answer.
With post-training quantization, I was afraid that the quantization values would be re-estimated from the representative dataset, so the values learned during training would be lost. Isn't that the case?
Will this still preserve the quantization values obtained during QAT?

There will be small optimizations to the weights and other variables, but there is no significant loss with int8 quantization, and it also reduces inference time. Please refer to the benchmark of different models under different quantization schemes, as shown in the screenshot below.

ok, will try.

@jean_dspo @chunduriv
I have the same question about converting inputs/outputs to int8. So we should set the input/output types to int8 in the TFLite converter and use the representative dataset as well, right?


Yes, we need to specify input and output types as int8 and also provide the representative dataset.
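A rough sketch of what that could look like, following the post-training integer quantization settings. The helper names convert_to_int8 and make_representative_dataset, and the samples argument, are assumptions for illustration; the samples are expected to be float32 arrays shaped like one batch of model input. This only shows the converter settings and does not by itself settle whether the ranges learned during QAT are kept or re-estimated.

```python
def make_representative_dataset(samples):
    """Wrap calibration samples in the generator form the converter expects."""
    def representative_dataset():
        for sample in samples:
            # One list per calibration step, holding one array per model input.
            yield [sample]
    return representative_dataset


def convert_to_int8(quant_aware_model, samples):
    """Convert a Keras model to TFLite with int8 input and output tensors."""
    # Imported here so the helper above stays usable without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = make_representative_dataset(samples)
    # Restrict to int8 kernels so unsupported ops fail instead of staying float.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    # Make the model's input and output tensors int8 instead of float32.
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()
```

The returned bytes can be written to a .tflite file the same way as in the original conversion snippet.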

For more information, please refer to the reference mentioned above.

Thank you!

Thank you. @chunduriv