Deploying TensorFlow Vision Models in Hugging Face with TF Serving

Serving image models can be hard. One major problem is sending raw images in the request payloads, which makes the requests large. Another is training/serving skew.

In my latest blog post, I show how to locally deploy a ViT (Base-16) from :hugs: Transformers that takes care of the above issues:

  • We encode the image as a base64 string instead of sending raw pixel values, thereby reducing the size of the payload considerably.
  • We embed the preprocessing and postprocessing ops within the serving model to reduce training/serving discrepancy.
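
To make the first point concrete, here is a minimal client-side sketch of how a request body could be built. It assumes the serving model accepts a web-safe base64 string per instance and decodes it itself (for example with `tf.io.decode_base64`); the helper names `build_predict_payload` and `decode_payload_image` are hypothetical and only for illustration, not taken from the blog post.

```python
import base64
import json


def build_predict_payload(image_bytes: bytes,
                          signature_name: str = "serving_default") -> str:
    """Build a JSON request body carrying one image as a base64 string.

    Assumption: the serving signature takes a batch of web-safe base64
    strings and decodes them inside the model graph.
    """
    # URL-safe (web-safe) alphabet, which is what tf.io.decode_base64 expects.
    b64_image = base64.urlsafe_b64encode(image_bytes).decode("utf-8")
    return json.dumps({
        "signature_name": signature_name,
        "instances": [b64_image],
    })


def decode_payload_image(payload: str) -> bytes:
    """Recover the original bytes from a payload (a local sanity check only)."""
    b64_image = json.loads(payload)["instances"][0]
    return base64.urlsafe_b64decode(b64_image)


# Stand-in bytes; in practice these would be the contents of a JPEG/PNG file,
# e.g. open(path, "rb").read().
fake_image_bytes = bytes(range(256)) * 4
payload = build_predict_payload(fake_image_bytes)
```

The resulting string could then be POSTed to a locally running TF Serving REST endpoint, typically of the form `http://localhost:8501/v1/models/<model_name>:predict`, where the model name depends on how the server was started.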

Next up, we’ll learn how to scale these kinds of deployments with Docker and Kubernetes. If you’re like me, a fan of true serverless infra, there will be a piece on doing this stuff with Vertex AI too. Stay tuned for that :slight_smile:
