I just read “Quantization Debugger” section in TF web page.
The feature looks cool, but I have something unclear.
The page says that “The RMSE / scale is close to 1 / sqrt(12) (~ 0.289) when quantized distribution is similar to the original float distribution, indicating a good quantized model. The larger the value is, it’s more likely for the layer not being quantized well.”
(from Inspecting Quantization Errors with Quantization Debugger | TensorFlow Lite)
I don’t understand where the number 1 / sqrt(12) comes from. My intuition says zero error would be ideal, so why this particular value? Can anyone help me understand how it is derived?
Hi @HyukJin_Jeong ,
The value of 1/sqrt(12) that you see in the TensorFlow documentation is related to the expected root mean square error (RMSE) of a uniformly distributed random variable. It comes from the mathematical properties of the uniform distribution and can be derived as follows:
The uniform distribution over the range [a, b] has a probability density function given by:
f(x) = 1 / (b-a) if a ≤ x ≤ b
The mean of this distribution is (a + b) / 2, and the variance is (b − a)² / 12. For a zero-mean error, the root mean square error (RMSE) equals the standard deviation, i.e. the square root of the variance, which is (b − a) / sqrt(12).
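To make this concrete, here is a small numerical sketch (not from the thread or the TF docs) that checks the standard deviation of a uniform distribution against the (b − a) / sqrt(12) formula:

```python
# Sketch: empirically verify that std(Uniform(a, b)) == (b - a) / sqrt(12).
import numpy as np

rng = np.random.default_rng(0)
a, b = -1.0, 3.0
samples = rng.uniform(a, b, size=1_000_000)

empirical_std = samples.std()
theoretical_std = (b - a) / np.sqrt(12)

print(empirical_std, theoretical_std)
```

With a million samples the two values agree to roughly three decimal places.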
When we convert a floating-point number to a fixed-point representation, we are essentially mapping a continuous range of values onto a discrete set of values. Ideally, the mapping should be such that the quantized values are evenly distributed within the range of the original floating-point values. If the mapping is uniform, then we can expect the quantization error to have a distribution that is similar to that of a uniformly distributed random variable.
In this case, each float value is rounded to the nearest quantized level, so the rounding error lies in the interval [−scale/2, scale/2], where scale is the quantization step size. Plugging b − a = scale into the formula above gives an expected RMSE of scale / sqrt(12). Dividing by the scale, we get RMSE / scale = 1 / sqrt(12) ≈ 0.289. When the measured ratio is close to this value, the quantization error behaves like ideal uniform rounding noise, and the layer is likely quantized well.
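As an illustration (my own sketch, not the Quantization Debugger’s actual implementation), we can quantize some Gaussian floats with a symmetric int8-style grid and check that RMSE / scale lands near 1 / sqrt(12):

```python
# Sketch: quantize float data on a uniform 8-bit grid and measure RMSE / scale.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=1_000_000).astype(np.float32)

# Symmetric int8-style quantization: map the float range onto [-127, 127].
scale = np.abs(x).max() / 127.0
q = np.clip(np.round(x / scale), -127, 127)
x_dequant = q * scale

rmse = np.sqrt(np.mean((x - x_dequant) ** 2))
print(rmse / scale)  # close to 1 / sqrt(12) ~ 0.289
```

Because the rounding error here really is close to uniform over one quantization step, the ratio comes out near 0.289, matching what the documentation describes for a well-quantized layer.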
It’s worth noting that while zero error would be ideal, it’s not always achievable in practice due to limitations in the precision of fixed-point representations. Therefore, we aim to minimize the quantization error as much as possible while also ensuring that the error is distributed uniformly and is not biased towards certain values.
Please let me know if this helps clarify things.
@Laxma_Reddy_Patlolla Thanks for your answer. It really helps me understand the document.
So, the approach assumes that a layer is well-quantized if the quantization error is uniformly distributed within one quantization step (i.e. not biased towards certain values).