Recently I have been diving into the implementation of quantization in TF-lite. I found two places in the source code that can “quantize” trained models, but I cannot tell what the differences between them are or which use cases each one is meant for. Once I have a trained model (a TensorFlow model, a *.pb file) and want to convert it to a *.tflite model with quantization enabled, which of the following two source files will be used?
The two source files for quantizing models are:
- tensorflow\compiler\mlir\lite\quantization\lite\quantize_model.cc, and
I would appreciate it if someone could help me figure this out. Thanks!
P.S.: After tracing some of the code, my current understanding is that the quantization in “mlir” is the “new” implementation and is still experimental, while the one in “optimize” is probably the “conventional” way to perform quantization.
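To make the P.S. concrete, here is a minimal sketch of the Python-side conversion as I understand it. It is a sketch under assumptions, not a confirmed description of the internals: it assumes the model is available as a SavedModel directory (rather than a frozen *.pb graph), that `representative_dataset` is a user-supplied generator of sample inputs, and that the converter's `experimental_new_quantizer` attribute is what switches between the “mlir” and “optimize” quantizers. The helper names `configure_quantization` and `convert_with_quantization` are hypothetical.

```python
def configure_quantization(converter, optimize_default, representative_dataset,
                           use_mlir_quantizer):
    """Set post-training quantization options on a TFLiteConverter-like object."""
    # Request the default post-training optimizations (this enables quantization).
    converter.optimizations = [optimize_default]
    # Calibration data: a callable returning an iterable of sample model inputs.
    converter.representative_dataset = representative_dataset
    # Assumption: this experimental attribute selects the quantizer.
    # True  -> the "mlir" implementation, False -> the "optimize" implementation.
    converter.experimental_new_quantizer = use_mlir_quantizer
    return converter


def convert_with_quantization(saved_model_dir, representative_dataset,
                              use_mlir_quantizer=True):
    """Convert a SavedModel directory to quantized *.tflite bytes."""
    # Imported lazily so the helper above can be used without TensorFlow installed.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    configure_quantization(converter, tf.lite.Optimize.DEFAULT,
                           representative_dataset, use_mlir_quantizer)
    return converter.convert()
```

Calling `convert_with_quantization(path, dataset_fn, use_mlir_quantizer=False)` and comparing the result against the default would be one way to check which of the two implementations is actually exercised.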
Please correct me if I am wrong. Thanks!