I want to convert a float32 model to 8-bit integers using post-training quantization. I followed the docs page and used the TFLite converter to do that, but some operators cause errors because they do not comply with the restrictions of the int8 spec.
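For context, a typical full-int8 post-training quantization setup with the TFLite converter looks roughly like the following. This is a sketch: `build_model()` and the random calibration data are placeholders for your own float32 Keras model and representative samples.

```python
import numpy as np
import tensorflow as tf

def build_model():
    # Placeholder float32 model; substitute your own.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(16,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(4),
    ])

def representative_dataset():
    # Yield a few float32 batches covering the expected input range;
    # the calibrator derives scale/zero_point from these samples.
    for _ in range(100):
        yield [np.random.uniform(-1, 1, size=(1, 16)).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(build_model())
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to the int8 builtin ops so unsupported operators fail
# loudly at conversion time instead of falling back to float.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

With this configuration, any operator that cannot satisfy the int8 spec surfaces as a conversion error rather than a silently mixed-precision model.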
In the case of a dense layer:
A dense layer with no bias, or with a zero bias, causes an error. The message is shown below.
Currently we do not support a manual scale-setting API; we rely on the automatic representative-dataset calibrator. You could try navigating the flatbuffer and modifying some constants, although the model is then not guaranteed to work properly with the interpreter afterwards.
This looks like unintended behavior to the TF team as well. If you can provide the model and the PTQ code, it will be easier for the team to investigate. Thanks!
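If you do experiment with the flatbuffer route, it helps to first see what scale and zero_point the calibrator actually chose. One way to inspect them is through the interpreter's tensor details; the snippet below builds a tiny throwaway model just so it is self-contained, the interesting part is the last loop.

```python
import numpy as np
import tensorflow as tf

# Build and quantize a small placeholder model (assumption: in practice
# you would already have `tflite_model` from your own conversion).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = lambda: (
    [np.random.uniform(-1, 1, (1, 4)).astype(np.float32)] for _ in range(10))
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

# Inspect the quantization parameters the calibrator assigned per tensor.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
for detail in interpreter.get_tensor_details():
    scale, zero_point = detail["quantization"]
    print(detail["name"], scale, zero_point)
```

Knowing which tensors carry which parameters is a prerequisite for any manual flatbuffer editing, and also makes it easier to report calibration oddities back to the team.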
Are there any updates on setting scale and zero_point for the input and output when using post-training quantization?
As far as I understand, the automatic representative-dataset calibrator sets scale and zero_point so that the whole value range of the representative dataset can be represented. But I would like to set the scale and zero_point myself, so that I have some buffer for amplitudes in my signals that are higher than what my dataset presents.
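For intuition, the affine int8 parameters the calibrator derives from a range can be sketched as below. The `headroom` factor is purely my own illustrative knob for widening the calibrated range, not a converter option.

```python
def quant_params(rmin, rmax, qmin=-128, qmax=127):
    """Affine int8 quantization parameters for a float range [rmin, rmax]."""
    # The range must include zero so that 0.0 is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, zero_point

# Parameters as calibrated from a dataset spanning [-1, 1]:
print(quant_params(-1.0, 1.0))  # scale ~0.00784, zero_point 0

# Widening the range by 50% to leave headroom for larger amplitudes;
# the cost is a coarser scale, i.e. lower resolution per step.
headroom = 1.5
print(quant_params(-1.0 * headroom, 1.0 * headroom))
```

This also shows the trade-off behind your request: buffering for higher amplitudes enlarges the scale, so every quantization step becomes coarser across the whole range.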
Although I think you made a nice suggestion, we do not support manually setting scale and zero_point, and we have no plans to support it yet.