How to load model for deployment properly?

I am deploying a model after training. In my Flask app, I load it like this:

from flask import Flask
from tensorflow.keras.models import model_from_json

app = Flask(__name__)

def seg_model():
    # Rebuild the architecture from JSON, then load the trained weights
    global model
    json_file = open("modelJ.json", 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    model = model_from_json(loaded_model_json)
    model.load_weights("modelWeight.h5")


if __name__ == '__main__':
    seg_model()
    app.run(debug=True)

Starting the server takes some time, which I think is fine. But when I call model.predict for the first time, there is another long loading delay, like below:

2021-12-05 12:45:08.336852: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-12-05 12:45:13.575012: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-12-05 12:45:32.260849: I tensorflow/core/platform/windows/subprocess.cc:308] SubProcess ended with return code: 0
2021-12-05 12:45:32.740390: I tensorflow/core/platform/windows/subprocess.cc:308] SubProcess ended with return code: 0
2021-12-05 12:45:33.551634: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-12-05 12:45:43.796617: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll

How can I overcome this loading time when predicting for the first time?

@MSI Welcome to the TensorFlow Forum!

Below are some approaches you can try. Let us know if any of them works well for your use case:

  • Avoid wrapping the loading code in seg_model() and calling it separately. Instead, load the model once at module level, outside any function, and make it globally accessible. This ensures the model is loaded exactly once, when the server process starts, rather than lazily on the first request (see the first sketch after this list).

  • The JSON/HDF5 approach is discouraged for production due to potential compatibility issues.

  • Save your trained model in the TensorFlow SavedModel format, which is self-contained and widely supported. Use model.save('path/to/saved_model') during training to export the model in this format.

  • For large-scale deployments or complex models, consider TensorFlow Serving. It offers benefits such as model versioning, scalability, and efficient serving; a minimal client example follows this list.
    If required, refer to the TensorFlow Serving documentation for details: Serving Models  |  TFX  |  TensorFlow
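Here is a minimal sketch combining the first and third points above: the model is loaded from the SavedModel format once at module level, so the cost is paid at startup instead of on the first request. The path 'path/to/saved_model', the /predict route, and the JSON input handling are placeholders; adapt them to your model.

import numpy as np
import tensorflow as tf
from flask import Flask, request, jsonify

app = Flask(__name__)

# Loaded once, at module level, when the server process starts
model = tf.keras.models.load_model("path/to/saved_model")

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"instances": [[...], [...]]} (placeholder format)
    data = np.array(request.get_json()["instances"])
    preds = model.predict(data)
    return jsonify(predictions=preds.tolist())

if __name__ == '__main__':
    app.run(debug=True)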
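If you go the TensorFlow Serving route, the Flask app (or any client) only sends HTTP requests, and the model stays loaded inside the serving process. A rough client sketch, assuming the exported SavedModel is already being served under the name my_model on TensorFlow Serving's default REST port 8501 (both are assumptions, adjust to your setup):

import requests

# Placeholder input; replace with data shaped for your model
payload = {"instances": [[0.1, 0.2, 0.3, 0.4]]}

# TensorFlow Serving REST predict endpoint: /v1/models/<model_name>:predict
resp = requests.post("http://localhost:8501/v1/models/my_model:predict", json=payload)
print(resp.json()["predictions"])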

Let us know what works well for you.