Getting 300 num_detections while using Tensorflow Serving via REST API

I am getting 300 num_detections while serving my custom resnet152_v1 model through Tensorflow Serving.

The model was exported with TensorFlow's exporter, with encoded_image_string_tensor as the input type.
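Since the export used encoded_image_string_tensor, the REST request can carry the raw JPEG bytes base64-encoded under a "b64" key instead of a decoded pixel array. A minimal sketch of building such a payload (the stand-in bytes and the single-input instance shape are assumptions; your signature may require an explicit input key):

```python
import base64
import json

# Stand-in for raw JPEG bytes, e.g. open(path, "rb").read() in real use
jpeg_bytes = b"\xff\xd8\xff\xe0fake-jpeg-bytes"

# TensorFlow Serving's REST API decodes {"b64": ...} values into binary
# strings, which matches an encoded_image_string_tensor input.
payload = {
    "signature_name": "serving_default",
    "instances": [{"b64": base64.b64encode(jpeg_bytes).decode("utf-8")}],
}
body = json.dumps(payload)
```

The resulting `body` would then be POSTed to the same `:predict` endpoint.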

My model works fine when I use it locally with this script from Tanner Gilbert.

Here is my REST API request:

import requests
import json
import tensorflow as tf
import numpy as np
import cv2
from PIL import Image
from io import BytesIO

# Patch the location of gfile for TF2 compatibility
tf.gfile = tf.io.gfile

# Replace these labels with your actual labels
labels = ['Like_post', 'Like_photo', 'Like_num', 'Like_redky']

path = 'test_images/test_image/58111313621316938095299609.jpg'

img_data = tf.io.gfile.GFile(path, 'rb').read()
image = Image.open(BytesIO(img_data))
(im_width, im_height) = image.size
image = np.array(image.getdata()).reshape(
    (im_height, im_width, 3)).astype(np.uint8)

# Prepare the JSON request with the signature name
image = image.tolist()
data = {
    "signature_name": "serving_default",
    "instances": [{"input_tensor": image}]  # Adjust the input key based on your model's signature
}

# Send the inference request to TensorFlow Serving
url = 'http://localhost:8501/v1/models/v1:predict'  # Replace the second 'v1' with the actual model name
headers = {"content-type": "application/json"}
response = requests.post(url, data=json.dumps(data), headers=headers)

# Process the response
if response.status_code == 200:
    predictions = response.json()['predictions'][0]
    predicted_class_idx = np.argmax(predictions)
    predicted_label = labels[predicted_class_idx]
    print("Predicted Label:", predicted_label)
    print("Class Probabilities:", predictions)
else:
    print("Error: Unable to get predictions. Status code:", response.status_code)
    print("Response content:", response.content)

I spent ages trying to debug this. I tried multiple ways to preprocess the image, since I thought that was the cause of the issue, but every time I get the same result with 300 num_detections. That is why this request includes the same preprocessing used in the initial local script.

This is part of my output, as the full response is very long due to the number of detections I am getting:

'num_detections': 300.0, 'raw_detection_boxes': [[0.71807313, 0.049279619, 0.741069555, 0.115024686], [0.711260557, 0.0462932028, 0.743643343, 0.108250089], [0.708367467, 0.0556791946, 0.734233856, 0.112597443], [0.113099433, 0.0460921936, 0.145879179, 0.109014511],...
Predicted Label: Like_post
Class Probabilities: 0

In the end, I think the main reason the response cannot be processed correctly is the 300 num_detections. Please tell me how to solve this issue, or point out any other mistake I am making. Thanks!
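For context on what I have understood so far: detection models typically pad their output to a fixed number of detections (here 300), and the usual post-processing keeps only boxes above a score threshold. A sketch with synthetic values (the key names follow the Object Detection API convention and the threshold is an assumption; my actual signature may differ):

```python
import numpy as np

# Synthetic stand-in for one prediction from the response
# (a real response carries 300 padded detections).
predictions = {
    "num_detections": 300.0,
    "detection_scores": [0.92, 0.85, 0.04, 0.01],
    "detection_classes": [1.0, 2.0, 1.0, 3.0],
    "detection_boxes": [[0.1, 0.1, 0.4, 0.4],
                        [0.5, 0.5, 0.9, 0.9],
                        [0.0, 0.0, 0.1, 0.1],
                        [0.2, 0.2, 0.3, 0.3]],
}

scores = np.array(predictions["detection_scores"])
keep = scores >= 0.5  # keep only confident detections
classes = np.array(predictions["detection_classes"], dtype=int)[keep]
boxes = np.array(predictions["detection_boxes"])[keep]
```

With this filter, the padded low-score boxes drop out and only the confident detections remain.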