Quantizing models using TFLite

Hi everyone, I am trying to quantize my model to 4-bit using post-training quantization. I see that TFLite supports int8 and float16. How can I do post-training quantization to convert my float32 model to an int4 model? Is it possible? Kindly provide your views.

Regards
Saras

Hi @saras26, currently there is no support for int4 quantization. int8 and float16 are the lowest-precision options supported. Thank you.
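For reference, here is a minimal sketch of post-training quantization with the TFLite converter using the supported int8 and float16 paths (the model path is a placeholder for your own SavedModel):

```python
import tensorflow as tf

# Load the trained float32 model (path is a placeholder).
converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")

# Post-training dynamic-range quantization: weights are stored as int8.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# For float16 quantization instead, additionally set:
# converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
```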


Thank you so much for the clarification.