After conversion, the BatchNorm layer appears as Mul and Add nodes. However, when I replicate the Mul and Add functionality myself using the gamma, beta, and input tensors, I see a discrepancy for quantized models. Is there any additional computation, beyond these two operations, that could be causing this?
The FP32 model works fine when all parameters are extracted and applied as a BatchNorm layer, but the same is not true for INT8/UINT8 models.
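For reference, here is a minimal sketch (with made-up tensor values; the variable names are illustrative, not taken from any specific converter) of how a BatchNorm layer is commonly folded into Mul and Add constants. Note that the folded constants typically incorporate the running mean and variance in addition to gamma and beta:

```python
import numpy as np

# Hypothetical BatchNorm parameters: gamma/beta are the learned
# scale/shift, mean/var are the layer's running statistics.
gamma = np.array([1.2, 0.8], dtype=np.float32)
beta  = np.array([0.1, -0.3], dtype=np.float32)
mean  = np.array([0.5, -0.2], dtype=np.float32)
var   = np.array([0.9, 1.1], dtype=np.float32)
eps   = 1e-5

x = np.random.randn(4, 2).astype(np.float32)

# Reference BatchNorm computation (inference mode).
bn = gamma * (x - mean) / np.sqrt(var + eps) + beta

# Folded form: the Mul/Add nodes use precomputed constants,
# not gamma and beta directly.
scale = gamma / np.sqrt(var + eps)   # constant for the Mul node
shift = beta - mean * scale          # constant for the Add node
folded = x * scale + shift

# In FP32 the two forms agree to within rounding error.
assert np.allclose(bn, folded, atol=1e-6)
```

In a quantized model, these same constants are additionally subject to the quantization scales and zero points attached to the Mul/Add tensors, which is one plausible source of a mismatch when the computation is replicated in floating point.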