Post integer quantization error of custom model

Hello there!
I am trying to quantize a custom model for image classification. The idea is to create a model with two branches: an early exit and a main branch. Depending on the outcome of the early-exit branch, the model should either return that prediction or continue through the backbone for a final one. I trained the model with both outputs and a weighted sum of their losses (a sketch of that setup follows the ChooseBranchLayer definition below), but for inference I rebuilt it as follows so that only one output is produced:

def EE_new():
    input = Input(shape=[32, 32, 3])

    # Backbone layers up to the first merge, reused from the trained model EE
    x = EE.layers[1](input)
    x = EE.layers[2](x)
    tres = EE.layers[3](x)
    x = EE.layers[4](tres)
    x = EE.layers[5](x)
    x = EE.layers[6](x)
    x = EE.layers[7](x)
    x = EE.layers[8](x)
    x = EE.layers[9]([x, tres])

    # Second block with a skip connection
    activation_7 = EE.layers[10](x)
    x = EE.layers[11](activation_7)
    x = EE.layers[12](x)
    x = EE.layers[13](x)
    x = EE.layers[14](x)
    x = EE.layers[16](x)
    activation_7 = EE.layers[15](activation_7)
    x = EE.layers[17]([x, activation_7])
    common = EE.layers[18](x)

    # Early-exit head
    output1 = EE.layers[-4](common)
    output1 = EE.layers[-2](output1)

    output = ChooseBranchLayer()(common, output1)

    return Model(inputs=input, outputs=output)

Where ChooseBranchLayer is:
class ChooseBranchLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(ChooseBranchLayer, self).__init__()

    def call(self, common_output, output1):
        # Linear confidence score on the two largest early-exit outputs
        top_values, top_indices = tf.math.top_k(output1, k=2)
        condition = (tf.tensordot(top_values, [5.39075334, -1.86204806], axes=1) - 3.78367282) > 0
        # Take the early exit if confident, otherwise run the rest of the backbone
        return tf.cond(condition, lambda: output1, lambda: self.branch2(common_output))

    def branch2(self, common_output):
        # Remaining backbone layers, again reused from the trained model EE
        x = EE.layers[19](common_output)
        x = EE.layers[20](x)
        x = EE.layers[21](x)
        x = EE.layers[22](x)
        x = EE.layers[24](x)
        common = EE.layers[23](common_output)
        x = EE.layers[25]([x, common])

        activation_11 = EE.layers[26](x)
        x = EE.layers[27](activation_11)
        x = EE.layers[28](x)
        x = EE.layers[29](x)
        x = EE.layers[30](x)
        x = EE.layers[32](x)
        activation_11 = EE.layers[31](activation_11)
        x = EE.layers[33]([x, activation_11])

        # Final head of the main branch
        x = EE.layers[34](x)
        x = EE.layers[35](x)
        x = EE.layers[37](x)
        output2 = EE.layers[39](x)

        return output2
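
For context, the weighted-sum training mentioned above used the standard Keras two-output setup, roughly like this (the optimizer, losses, and weights here are illustrative, not my exact values):

train_model = Model(inputs=input, outputs=[output1, output2])
train_model.compile(
    optimizer="adam",
    loss=["categorical_crossentropy", "categorical_crossentropy"],
    loss_weights=[0.3, 0.7],  # weighted sum of early-exit and main losses
    metrics=["accuracy"],
)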

This model works fine in TensorFlow, and conversion to TFLite causes no problems either. However, after providing a representative dataset for post-training integer quantization, I get the following error when calling the invoke method:
RuntimeError: tensorflow/lite/kernels/conv.cc:374 affine_quantization->zero_point->data[i] != 0 (-11 != 0)Node number 32 (CONV_2D) failed to prepare.Node number 27 (IF) failed to prepare.

Dynamic range quantization also works fine, but I would really like to get integer-only quantization.
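
For reference, the conversion follows the standard full-integer post-training path, roughly like this, where representative_dataset stands in for my actual calibration generator:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
interpreter.invoke()  # the "failed to prepare" RuntimeError shows up here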

Thank you very much!!

The failing check lives in the TFLite CONV_2D kernel: per-channel quantized weights must be symmetric, i.e. every channel's zero point must be 0, and one of your conv weight tensors came out with a zero point of -11. The second half of the message is the real clue: node 27 is an IF op, which is what your tf.cond becomes after conversion, and post-training full-integer quantization has only limited support for control-flow ops, so layers inside the IF subgraphs can end up with invalid quantization parameters. A few things to try, in rough order of likely payoff: make sure the representative dataset actually exercises both branches, since a branch that never runs during calibration gets no statistics; restructure the inference model so no IF op is emitted at all, for example by computing both outputs and selecting with tf.where; allow float fallback so whatever the converter cannot fully quantize stays in float32; consider quantization-aware training; and update to the latest TensorFlow, since control-flow quantization support improves between releases.
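
The tf.where rewrite is usually the cleanest fix, because the conditional disappears from the graph entirely, at the cost of always executing both branches. A minimal sketch of the call method, keeping your fitted coefficients:

def call(self, common_output, output1):
    top_values, _ = tf.math.top_k(output1, k=2)
    # Same linear confidence score as before
    score = tf.tensordot(top_values, tf.constant([5.39075334, -1.86204806]), axes=1) - 3.78367282
    # Both branches are always computed, so no IF op is emitted; tf.where
    # then selects per example. As a bonus this handles batched inputs,
    # which tf.cond cannot, since it needs a scalar predicate.
    output2 = self.branch2(common_output)
    return tf.where(score[:, tf.newaxis] > 0, output1, output2)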
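
If you would rather keep the IF op, you can let the converter fall back to float32 kernels for the parts it cannot fully quantize, by dropping the strict TFLITE_BUILTINS_INT8 restriction and the int8 input/output types:

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# No target_spec restriction: ops with int8 kernels are quantized,
# everything else stays in float32 instead of failing to prepare.
tflite_model = converter.convert()

This gives a mostly integer model rather than an integer-only one, so it is a workaround rather than a full answer, but it is a quick way to confirm that the IF subgraphs are the only thing blocking full quantization.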