What are the differences between, and use cases for, QuantizeModel under the mlir and optimize directories?

huzq85 · July 8, 2022, 1:35am

Hi,
Recently I have been diving into the implementation of quantization in TF Lite. I found two places in the source code that can “quantize” a trained model, but I cannot tell what the differences between them are, or what the use scenarios are for each. Once I have a trained model (a TensorFlow model, a *.pb file) and want to convert it to a *.tflite model with quantization enabled, which of the following two source files will be used?
The two source files for quantizing models are:

tensorflow\compiler\mlir\lite\quantization\lite\quantize_model.cc, and
tensorflow\lite\tools\optimize\quantize_model.cc

I would appreciate it if someone could help me figure this out. Thanks!

P.S.: After tracing some code, my current understanding is that the quantizer under “mlir” is the “new” one and is still experimental, while the one under “optimize” may be the “conventional” way to perform quantization.

Please correct me if I am wrong. Thanks!
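
For context, here is a minimal sketch of the conversion I have in mind. It assumes the *.pb file is part of a SavedModel directory and that a calibration generator is available; `quantize_saved_model`, `saved_model_dir` and `representative_dataset` are placeholder names of my own. As far as I can tell from the converter documentation, `experimental_new_quantizer` switches between MLIR-based and flatbuffer-based quantization, but I have not confirmed that it maps onto the two files above.

```python
def quantize_saved_model(saved_model_dir, representative_dataset,
                         use_mlir_quantizer=True):
    """Convert a SavedModel to a quantized .tflite flatbuffer (returned as bytes).

    saved_model_dir and representative_dataset are placeholders (assumptions),
    not names taken from the TensorFlow source files mentioned above.
    """
    # Imported here so that defining the function does not require TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Optimize.DEFAULT plus a representative dataset requests post-training
    # quantization with calibration.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Documented as an experimental flag: True selects MLIR-based quantization,
    # False the older flatbuffer-based path. Assumption (unconfirmed): these
    # correspond to the "mlir" and "optimize" quantize_model.cc files.
    converter.experimental_new_quantizer = use_mlir_quantizer
    return converter.convert()
```

The `representative_dataset` argument would be a generator function that yields lists of float32 arrays matching the model's input shape, and the returned bytes can be written to a *.tflite file. Converting the same model once with `use_mlir_quantizer=True` and once with `False` might be one way to see which code path is taken.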

Amin_Jabari · July 8, 2022, 10:07am

Did you find a solution?

huzq85 · July 8, 2022, 11:55am

Not quite yet, but it is very likely that the quantization in “mlir” is newer than the one in “optimize”, though still experimental.