Reducing XLA AOT compiled model size


I am using XLA AOT to compile my model so that it can run on bare metal. However, the compiled model is much larger than the uncompiled one. I have tried some size optimisations from the Graph Transformation Tool, such as quantize_weights, which reduce the size of the uncompiled model, but once compiled, the result is the same as the compiled version of the original, non-optimised model:

Is there any advice on how to reduce the size of the compiled model? Or on why the Graph Transformation Tool optimisations have no effect after compilation?
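To make the size question concrete, here is a minimal sketch of linear 8-bit weight quantization in plain Python. It is an illustration of the general technique, not the Graph Transformation Tool's actual implementation; the function names `quantize_linear_8bit` and `dequantize_linear_8bit` and the sample weights are hypothetical. It shows that the saving exists only while the weights are stored as 8-bit codes: as soon as they are expanded back to 32-bit floats, the storage returns to its original size. Whether XLA AOT performs such an expansion at compile time is an assumption and is not confirmed in this thread.

```python
import struct


def quantize_linear_8bit(weights):
    """Map floats to 8-bit codes with a linear min/max scheme (hypothetical helper)."""
    lo, hi = min(weights), max(weights)
    # Fall back to a scale of 1.0 when all weights are equal, to avoid dividing by zero.
    scale = (hi - lo) / 255.0 or 1.0
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, lo, scale


def dequantize_linear_8bit(codes, lo, scale):
    """Expand 8-bit codes back to approximate float values (hypothetical helper)."""
    return [lo + c * scale for c in codes]


# Sample data: 1000 made-up weights in the range [-0.5, 0.5).
weights = [i / 1000.0 - 0.5 for i in range(1000)]

# Storage as 32-bit floats: 4 bytes per weight.
float_bytes = struct.pack(f"{len(weights)}f", *weights)

# Storage as 8-bit codes: 1 byte per weight (plus two floats for lo and scale).
codes, lo, scale = quantize_linear_8bit(weights)

# Expanding the codes back to 32-bit floats restores the original footprint.
restored = dequantize_linear_8bit(codes, lo, scale)
restored_bytes = struct.pack(f"{len(restored)}f", *restored)

print(len(float_bytes))     # 4000
print(len(codes))           # 1000
print(len(restored_bytes))  # 4000
```

Comparing the size of the constant data in the compiled object file against these numbers may help show whether the weights ended up stored as floats or as 8-bit values.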


Do we now have full support for quantized models with AOT?

I ask because this ticket has been open since 2017:

We also have:

Hi Bhack,
Thanks for the replies.
Do you know of any other methods for reducing the size of the compiled model?

I don’t know if we have a specific guide for XLA.
But if you are working with a constrained device, we have a guide for TF Lite: