Creating a TensorFlow executable

Hi, I trained a CNN (a small UNet) in TensorFlow 2.4 with Python 3.8 and CUDA support. The model works well and runs fast in Python, but I need to build an executable program from it, so that a PC can run it WITHOUT having TF and Python installed. The target device runs Windows 10 and has CUDA 10.2 installed. In addition, the executable program should not exceed ~100MB and needs to support CUDA.
Is this even possible, and if so, what is the best way to build such an executable program?

If you need to use CUDA and C++, have you tried NVIDIA's TensorRT runtime?

Or you can find an example that uses libtensorflow directly at:
https://medium.com/@reachraktim/using-the-new-tensorflow-2-x-c-api-for-object-detection-inference-ad4b7fd5fecc

/cc @markdaoust do you know if we have any other inference examples for GPU on Windows? We already have another request at:
https://tensorflow-prod.ospodiscourse.com/t/basi-things-to-deploy-the-model-on-windows-c-in-visual-studio/5201

I’m not sure.

There are partial C++ SavedModel instructions in the "Using the SavedModel format" guide on TensorFlow Core.

IDK what the status of the new C saved_model API is (tensorflow/README.md at master in the tensorflow/tensorflow GitHub repo). IIUC it just needs the "Install TensorFlow for C" library, but that's 500MB.

Given your requirements, TFLite may be your best bet. It has GPU support and is designed for space-limited applications.
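
As a rough sketch of the TFLite route (not a tested recipe for this exact model): the conversion would be done once in Python on the training machine, and the resulting .tflite file checked against the ~100MB limit from the question. The function names and the paths in the usage comment are hypothetical. It assumes the UNet was exported as a SavedModel first, and that every op in it is supported by the TFLite converter, which is not guaranteed.

```python
import os

SIZE_BUDGET_MB = 100  # the ~100MB limit from the question


def file_size_mb(path):
    """Return the size of a file in megabytes (1 MB = 1024 * 1024 bytes)."""
    return os.path.getsize(path) / (1024 * 1024)


def fits_budget(path, budget_mb=SIZE_BUDGET_MB):
    """True if the file at `path` is no larger than `budget_mb` megabytes."""
    return file_size_mb(path) <= budget_mb


def convert_saved_model(saved_model_dir, tflite_path):
    """Convert a SavedModel directory into a .tflite flatbuffer file."""
    # Imported lazily so the size helpers above work without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()  # returns the model as bytes
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
    return tflite_path


# Example usage (hypothetical paths):
#   out = convert_saved_model("unet_saved_model", "unet.tflite")
#   print(f"{file_size_mb(out):.1f} MB, fits budget: {fits_budget(out)}")
```

Note that this only measures the model file. The executable also has to ship the TFLite runtime library, which counts toward the size limit too.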

This was also requested in another thread, but I suppose that we currently don't have an NVIDIA GPU delegate for this solution:

https://tensorflow-prod.ospodiscourse.com/t/error-when-using-tflite-interpreter-in-flask/4961/38?u=bhack