TF 2.7.1 uses about a third less CPU than TFL 2.7.0 (40% vs 60% of one core)

I thought I would ask first, as maybe I should compile TFL 2.7.1 myself rather than use the available wheels. I am running a CRNN using tf.lite.Interpreter (from full TF 2.7.1) on a Pi Zero 2, and it uses approx. 40% of a single core with nearly 200 MB of memory.

Using tflite.Interpreter from the standalone TFL 2.7.0 package surprised me, though: yes, memory drops to about 90 MB, but load goes up to 60% of a single core.

[edit] It will not let me upload screenshots of the load, so you will just have to take my word for it.

That's the same model in both cases, once using the tflite interpreter from full TF and once using the standalone TFL from pip. Is that result right?
I guess I should get round to compiling a 2.7.1 TFL to see if it's any different, but thought I would ask first.

Could you please share standalone code or a Colab gist to replicate your issue? Thank you.
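
For reference, a standalone comparison along these lines might look like the sketch below. It is not the original poster's code: `load_interpreter_class`, `measure`, and `benchmark_model` are hypothetical helper names, `model.tflite` is a placeholder path, and it assumes a Linux system such as Raspberry Pi OS (the `resource` module is Unix-only, and `ru_maxrss` is reported in kilobytes on Linux). It feeds an all-zeros input, so it measures raw `invoke()` cost rather than the real audio pipeline.

```python
import resource
import time


def load_interpreter_class(prefer="tflite_runtime"):
    """Return (package_name, Interpreter class), or (None, None) if neither is installed."""
    order = ["tflite_runtime", "tensorflow"]
    if prefer == "tensorflow":
        order.reverse()
    for name in order:
        try:
            if name == "tflite_runtime":
                from tflite_runtime.interpreter import Interpreter
            else:
                import tensorflow as tf
                Interpreter = tf.lite.Interpreter
            return name, Interpreter
        except ImportError:
            continue
    return None, None


def measure(step, runs=100):
    """Call step() `runs` times; report wall time, process CPU time, and peak RSS."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time summed over all threads of this process
    for _ in range(runs):
        step()
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    return {
        "runs": runs,
        "wall_s": wall_s,
        "cpu_s": cpu_s,
        # 100 means one core fully busy for the whole measurement window.
        "cpu_percent": 100.0 * cpu_s / max(wall_s, 1e-9),
        # Peak resident set size of the whole process, including imports (KB on Linux).
        "max_rss_kb": resource.getrusage(resource.RUSAGE_SELF).ru_maxrss,
    }


def benchmark_model(model_path, prefer="tflite_runtime", runs=100):
    """Benchmark invoke() on a zero-filled input with the chosen interpreter package."""
    import numpy as np  # imported here so measure() can be used without NumPy

    name, Interpreter = load_interpreter_class(prefer)
    if Interpreter is None:
        raise RuntimeError("neither tflite_runtime nor tensorflow is installed")
    interpreter = Interpreter(model_path=model_path, num_threads=1)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    data = np.zeros(inp["shape"], dtype=inp["dtype"])  # dummy input, assumption

    def step():
        interpreter.set_tensor(inp["index"], data)
        interpreter.invoke()

    return name, measure(step, runs)


# Example (run once per package, each in a fresh process so peak RSS is comparable):
#   print(benchmark_model("model.tflite", prefer="tensorflow"))
#   print(benchmark_model("model.tflite", prefer="tflite_runtime"))
```

Running each variant in its own process matters because `ru_maxrss` is a process-wide peak; importing full TensorFlow in the same process would hide the memory advantage of the standalone runtime. Pinning `num_threads=1` in both cases is a choice made here to rule out a difference in default thread counts between the two packages.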