Recently I have been studying quantization in TensorFlow Lite (tf-lite). I found that “quantization” appears in several places in the TensorFlow code base.
The first one is under “tensorflow\compiler\mlir\lite\quantization”. As I understand it, this is the “new” way to quantize a TensorFlow model for tf-lite.
The second one is under “tensorflow\lite\tools\optimize”. As I understand it, this might be the “old” quantization path, and it may be deprecated in the future.
The third one I can find is under “tensorflow\lite\toco”. It looks like this one can be built as a command-line tool that performs quantization, whereas the first and second can be integrated into Python scripts.
My questions are: What is the relationship between the tf-lite quantization code in these three places? Which one is used most frequently? And do they essentially use the same mechanism?
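To make the last question concrete: by “mechanism” I mean the affine mapping real_value = scale * (quantized_value - zero_point) that the tf-lite 8-bit quantization specification describes. Below is a minimal sketch of that mapping in plain Python. The function names and the way the parameters are chosen from a min/max range are my own illustration and are not taken from any of the three directories; the actual implementations may differ in rounding, per-channel handling, and so on.

```python
def choose_qparams(rmin, rmax, qmin=-128, qmax=127):
    """Pick (scale, zero_point) for an int8 affine mapping (illustrative only)."""
    # Widen the range to include 0.0 so that real zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    if rmax == rmin:
        # Degenerate all-zero range: any positive scale works.
        return 1.0, 0
    scale = (rmax - rmin) / (qmax - qmin)
    # zero_point is the integer that real 0.0 maps to, clamped to the int8 range.
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to its clamped integer representation."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer back to the real value it represents."""
    return scale * (q - zero_point)


# Hypothetical usage: quantize a value from a tensor whose range is [-1, 1].
scale, zero_point = choose_qparams(-1.0, 1.0)
q = quantize(0.42, scale, zero_point)
approx = dequantize(q, scale, zero_point)
```

If all three code paths end up producing tensors described by a (scale, zero_point) pair like this, then I would consider them to share the same mechanism, even if they compute the parameters at different stages.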
I would appreciate any information anyone can provide.