I have a simple fully connected model with the following architecture:
Input(10) → Dense(10) → Dense(10) → Dense(10) → Dense(10) → Dense(1)
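
For concreteness, the architecture can be sketched in NumPy roughly as follows. The ReLU activations on the hidden layers, the random weight initialization, and the helper names `init_params` and `forward` are assumptions made for illustration; the actual model may differ.

```python
import numpy as np

# Input(10) -> Dense(10) x 4 -> Dense(1)
LAYER_SIZES = [10, 10, 10, 10, 10, 1]

def init_params(seed=0):
    """Random float32 weights and zero biases (hypothetical initialization)."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:]):
        w = rng.standard_normal((n_in, n_out)).astype(np.float32) * 0.1
        b = np.zeros(n_out, dtype=np.float32)
        params.append((w, b))
    return params

def forward(x, params):
    """Float forward pass; ReLU on the hidden layers is an assumption."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:  # no activation on the final Dense(1)
            x = np.maximum(x, 0.0)
    return x

params = init_params()
batch = np.ones((4, 10), dtype=np.float32)
out = forward(batch, params)  # one scalar prediction per input row
```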
I performed int8 quantization-aware training (QAT) on it and ran inference with TFLite.
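
As a point of reference, integer-only inference for a single Dense layer generally follows the pattern sketched below: int8 inputs and weights, int32 accumulation, then rescaling back to int8. This is a simplified NumPy sketch and not TFLite's actual kernel: it rescales with a float multiplier rather than a fixed-point one, it assumes symmetric weights (zero point 0) and an int32 bias with scale `x_scale * w_scale`, and all scales, zero points, and function names here are hypothetical.

```python
import numpy as np

def quantize(x, scale, zero_point):
    """Map real values to int8: q = round(x / scale) + zero_point."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dense_int8(x_q, w_q, b_q, x_scale, x_zp, w_scale, out_scale, out_zp):
    """One Dense layer on int8 data (weights assumed symmetric, zero point 0)."""
    # Accumulate in int32 so the int8 products cannot overflow.
    acc = (x_q.astype(np.int32) - x_zp) @ w_q.astype(np.int32) + b_q
    # Rescale the accumulator to the output scale (float multiplier for clarity).
    multiplier = x_scale * w_scale / out_scale
    out = np.round(acc * multiplier) + out_zp
    return np.clip(out, -128, 127).astype(np.int8)

# Tiny example with made-up scales.
x_scale, w_scale, out_scale = 0.01, 0.01, 0.01
x = np.array([[0.5, -0.25]])
w = np.array([[0.5], [1.0]])
b = np.array([0.1])

x_q = quantize(x, x_scale, 0)
w_q = quantize(w, w_scale, 0)
b_q = np.round(b / (x_scale * w_scale)).astype(np.int32)  # bias stored as int32

y_q = dense_int8(x_q, w_q, b_q, x_scale, 0, w_scale, out_scale, 0)
y = y_q.astype(np.float32) * out_scale  # dequantized output
```

Any change to this arithmetic at inference time (different rounding, bit widths, or rescaling) is something the training-time simulation would also need to reproduce.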
I now want to perform int8 QAT, but with some changes to how inference is done.
For example, the current 8-bit inference looks like this:
And I want to change it to this:
If I run inference in NumPy the way the second image shows, accuracy drops, so I want to do QAT that takes all of those changes into account. Is that possible?