Reducing XLA AOT compiled model size


I am using XLA AOT to compile my model so that it can run on bare metal. However, the compiled model is much larger than the uncompiled one. I have tried size optimisations from the Graph Transformation Tool, such as quantize_weights, which reduce the size of the uncompiled model, but once compiled, the result is the same size as the compiled original, non-optimised version:

Is there any advice on how to reduce the size of the compiled model? Or why the Graph Transformation Tool optimisations don’t work?
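
One possible explanation, offered as a guess rather than confirmed behaviour: quantize_weights stores each weight tensor as 8-bit values and converts them back to float when the graph runs. If the compiler evaluates that conversion at compile time (constant folding), the binary ends up embedding full 32-bit floats again. The sketch below only illustrates that size arithmetic in plain Python. The `quantize` and `dequantize` helpers are hypothetical stand-ins, not TensorFlow or XLA APIs.

```python
from array import array
import random

def quantize(weights):
    """Linearly map floats onto 0..255 and keep the offset and scale."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # avoid a zero scale for constant tensors
    q = array("B", (round((w - lo) / scale) for w in weights))
    return q, lo, scale

def dequantize(q, lo, scale):
    """Expand 8-bit values back to 32-bit floats (what folding would embed)."""
    return array("f", (v * scale + lo for v in q))

random.seed(0)
weights = array("f", (random.gauss(0.0, 1.0) for _ in range(4096)))

q, lo, scale = quantize(weights)
restored = dequantize(q, lo, scale)

original_bytes = len(weights) * weights.itemsize   # float32 weights
stored_bytes = len(q) * q.itemsize                 # what the quantized graph stores
folded_bytes = len(restored) * restored.itemsize   # size after expanding back to float

print("original:", original_bytes)
print("quantized on disk:", stored_bytes)
print("after expanding to float:", folded_bytes)
```

If this is what happens, the saving is only visible in the uncompiled graph file, which would match the sizes described above.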


Do we now have full support for quantized models with AOT?

I ask because this ticket has been open since 2017:

We also have: