Deploying TensorFlow Vision Models in Hugging Face with TF Serving

Sayak_Paul · July 25, 2022, 4:33pm

Serving image models can be hard. One major problem is that sending raw images in request payloads makes those payloads large. Another is training/serving skew.

In my latest blog post, I show how to locally deploy a ViT (Base-16) from Transformers that takes care of the above issues:

We send the image as a base64-encoded string rather than as raw pixel values, thereby reducing the size of the payload considerably.
We embed the preprocessing and postprocessing ops within the serving model to reduce training/serving discrepancy.
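
To make the first point concrete, here is a minimal sketch of how a client could build such a request body. It assumes the serving function decodes URL-safe base64 (which is what `tf.io.decode_base64` expects) and that the model is exposed under the `serving_default` signature. The helper name `build_predict_request`, the model name `vit` in the URL comment, and the stand-in image bytes are hypothetical and only there for illustration.

```python
import base64
import json


def build_predict_request(image_bytes: bytes,
                          signature_name: str = "serving_default") -> str:
    """Return a JSON body for a TF Serving REST predict call.

    Assumption: the serving signature takes a batch of URL-safe
    base64 strings and decodes them inside the model graph.
    """
    b64_image = base64.urlsafe_b64encode(image_bytes).decode("utf-8")
    return json.dumps({"signature_name": signature_name,
                       "instances": [b64_image]})


# Stand-in for the bytes of an encoded image file (hypothetical data,
# not a valid JPEG); in practice these would come from open(path, "rb").
fake_image_bytes = b"\xff\xd8\xff\xe0" + bytes(range(256)) * 4

request_body = build_predict_request(fake_image_bytes)

# For comparison: the same bytes sent as a plain JSON list of integers.
naive_body = json.dumps({"instances": [list(fake_image_bytes)]})

# The body would then be POSTed to the REST endpoint, for example
# http://localhost:8501/v1/models/vit:predict (8501 is TF Serving's
# default REST port; the model name "vit" is an assumption).
```

In this sketch the base64 body is noticeably shorter than the integer-list body for the same bytes, and the gap gets much wider when the bytes are a compressed JPEG or PNG rather than a full pixel array.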

Next up, we’ll learn how to scale these kinds of deployments with Docker and Kubernetes. If you’re like me, a fan of true serverless infra, there will be a piece on doing this stuff with Vertex AI too. Stay tuned for that.