While doing quantization, is it possible to specify the scale and zero point for a TensorFlow int8 kernel?

I want to convert a float32 model to 8-bit integer using post-training quantization. I followed the web page and used the TFLite converter to do that, but some operators cause errors because they do not comply with the restrictions of the int8 spec.
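For reference, this is roughly the conversion flow I am using (the model and representative dataset here are stand-ins for my real ones):

```python
import numpy as np
import tensorflow as tf

# Stand-in model; my real model is larger but converted the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(2),
])

def representative_dataset():
    # Stand-in calibration data; real code feeds actual samples.
    for _ in range(100):
        yield [np.random.rand(1, 4).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full integer quantization per the int8 spec.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```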

In the case of a dense layer:
If the dense layer has no bias or a zero bias, it causes an error. The message is shown below:

tensorflow/lite/kernels/kernel_util.cc:106 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale)

I found that the bias scale is 0, which is why the error occurs. I need to specify the bias scale to avoid the problem.
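As a workaround attempt (not a real fix), I tried giving the dense layer an explicit, slightly non-zero bias so the calibrator does not end up with an all-zero bias tensor; the `1e-3` constant is an arbitrary choice of mine:

```python
import tensorflow as tf

# Possible workaround: initialize the bias to a small non-zero constant
# so the bias tensor is not identically zero at calibration time.
layer = tf.keras.layers.Dense(
    8,
    use_bias=True,
    bias_initializer=tf.keras.initializers.Constant(1e-3),
)
out = layer(tf.zeros((1, 4)))  # build the layer so weights exist
```

This avoids the zero-bias case, but it perturbs the model's outputs, which is why I would prefer a proper API for setting the scale.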

In the case of slice or strided_slice:
If I concatenate multiple strided_slice tensors, the concatenated elements all have different scales.

Thus, I want to specify the scale and zero point myself to make sure the model complies with the int8 spec.
Is there an API that supports this? Thanks!


Currently we do not support a manual scale-setting API; we rely on the automatic representative-dataset calibrator. You could try navigating the flatbuffer and modifying some constants, although this does not guarantee the model will still work properly with the interpreter afterwards.
This also looks like unintended behavior to the TF team. If you can provide the model and the PTQ code, it will be easier for the team to investigate. Thanks!
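Before editing the flatbuffer, it may help to inspect which tensors got which quantization parameters. A minimal sketch, assuming a small stand-in model quantized in memory; `tf.lite.Interpreter.get_tensor_details()` exposes each tensor's `(scale, zero_point)` pair:

```python
import numpy as np
import tensorflow as tf

# Stand-in model quantized in memory, just to have something to inspect.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

def representative_dataset():
    for _ in range(10):
        yield [np.random.rand(1, 4).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()

# Read (not modify) the per-tensor scale and zero point.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
quant_params = {
    d['name']: d['quantization']  # (scale, zero_point)
    for d in interpreter.get_tensor_details()
}
for name, (scale, zero_point) in quant_params.items():
    print(f'{name}: scale={scale}, zero_point={zero_point}')
```

Actually rewriting the parameters would require re-serializing the flatbuffer with the TFLite schema, which, as noted above, is unsupported and may break the model.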
