While doing quantization, is it possible to specify the scale and zero point for a TensorFlow Lite int8 kernel?

I want to convert a float32 model to 8-bit integers using post-training quantization. I followed the web page and used the TFLite converter to do that. But some operators cause errors because they do not comply with the restrictions of the int8 spec.

In the case of a dense layer:
If the dense layer has no bias or a zero bias, it causes an error. The message is shown below:

tensorflow/lite/kernels/kernel_util.cc:106 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale)

I found that the bias scale is 0, which is why that error occurs. I need to specify the bias scale to avoid this problem.

In the case of slice or strided_slice:
If I concatenate multiple strided_slice tensors, the concatenated inputs all end up with different scales.

Thus, I want to specify the scale and zero point myself to make sure the model complies with the int8 spec.
Is there any API to do that? Thanks!


Currently we do not support a manual scale-setting API; we rely on the automatic representative-dataset calibrator. You can try navigating the flatbuffer and modifying some constants, although this does not guarantee the model will work properly with the interpreter afterwards.
This looks like unintended behavior to the TF team as well; if you can provide the model and the PTQ code, it will be easier for the team to investigate. Thanks!



Are there any updates on setting scale and zero_point for input and output using post-training quantization?

As far as I understand, the automatic representative-dataset calibrator sets scale and zero_point so that the whole value range of the representative dataset can be represented. But I would like to set a scale and zero_point that leave some headroom for signals with higher amplitudes than my dataset presents.



Although I think you made a nice suggestion, we do not support manual setting of scale and zero_point, and we have no plans to support it yet.