Custom quantization-aware training with lround during int8 multiplications

I have a simple fully connected model with the following architecture:
Input(10) → Dense(10) → Dense(10) → Dense(10) → Dense(10) → Dense(1)
I performed int8 QAT on it and ran TFLite inference.

I now want to perform int8 QAT, but with some changes to the inference arithmetic.

For example, the current 8-bit inference looks like this:

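(For reference, I assume the standard TFLite per-tensor int8 scheme here: int32 accumulation of the quantized matmul, then requantization with round + clamp. The scales and zero points below are made-up illustrative values, not taken from my actual model.)

```python
import numpy as np

# Hypothetical per-tensor quantization parameters (illustrative values only).
S_x, Z_x = 0.05, 3      # input scale / zero point
S_w = 0.02              # weight scale (symmetric weights, zero point 0)
S_y, Z_y = 0.1, -5      # output scale / zero point

def int8_dense(x_q, w_q, bias_q):
    """TFLite-style int8 dense: accumulate in int32, then requantize."""
    acc = (x_q.astype(np.int32) - Z_x) @ w_q.astype(np.int32) + bias_q
    # Requantize: scale the int32 accumulator, round, add zero point, clamp.
    y = np.round(acc * (S_x * S_w / S_y)) + Z_y
    return np.clip(y, -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
x_q = rng.integers(-128, 128, size=(1, 10), dtype=np.int8)
w_q = rng.integers(-128, 128, size=(10, 10), dtype=np.int8)
bias_q = rng.integers(-1000, 1000, size=10).astype(np.int32)  # scale S_x * S_w
y = int8_dense(x_q, w_q, bias_q)
print(y)
```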
And I want to change it to this:

If I do inference with numpy following the second image, accuracy drops, so I want to run QAT that takes all those changes into account. Is that possible?
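In case it helps frame the question: my understanding is that the usual route is to reimplement the fake-quantization op so its forward pass uses the same rounding as the modified inference (here, lround-style round-half-away-from-zero, rather than NumPy's default round-half-to-even), and train with a straight-through estimator. A minimal sketch of that fake-quant forward pass, with made-up scale values:

```python
import numpy as np

def lround(x):
    # C's lround: round half away from zero (np.round rounds half to even).
    return np.sign(x) * np.floor(np.abs(x) + 0.5)

def fake_quant(x, scale, zero_point=0):
    """Simulate int8 quantization in float, using lround as the rounding mode."""
    q = np.clip(lround(x / scale) + zero_point, -128, 127)
    return (q - zero_point) * scale

# During QAT this fake_quant would be used in the forward pass, with the
# gradient passed straight through (d fake_quant / dx treated as 1 inside
# the clip range).
out = fake_quant(np.array([0.123, -0.05, 1.0]), scale=0.05)
print(out)  # → [ 0.1  -0.05  1.  ]
```

The point of the custom rounding function is that `lround(2.5)` gives 3 while `np.round(2.5)` gives 2, so training through `fake_quant` sees the same quantization error the modified integer inference will produce.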

Thank you