Model Server vs Notebook: Differences in Model Input/Output Shapes

Hey there,

I am super new to TensorFlow, so please forgive any missteps.
I have been given two versions of a model, the only difference between them being the input/output shapes, and I am seeing differences in how TensorFlow handles predictions between the two.

An excerpt of the input layer shapes:

v1:

feature_1 (None, 1) <dtype: 'string'>
feature_2 (None, 1) <dtype: 'string'>
feature_3 (None, 1) <dtype: 'float32'>
feature_4 (None, 1) <dtype: 'string'>
feature_5 (None, 1) <dtype: 'string'>

v2:

feature_1 (None,) <dtype: 'string'>
feature_2 (None,) <dtype: 'string'>
feature_3 (None,) <dtype: 'float32'>
feature_4 (None,) <dtype: 'string'>
feature_5 (None,) <dtype: 'string'>

To make a request to each model using .predict(), both versions accept a dictionary of the form:

	features = {"<feature_name>": np.array([<value>]), ...}

v1_p = v1.predict(features)  # [0.000062629]
v2_p = v2.predict(features)  # [[[0.00062629]]]

However, when using a model server, the format of the request differs:


  "instances": [
  "feature_1": ["a_string"],
  "feature_2": ["a_string"],
  "feature_3": [91.5],
  "feature_4": ["a_string"],
  "feature_5": ["a_string"],


  "instances": [
  "feature_1": "a_string",
  "feature_2": "a_string",
  "feature_3": 91.5,
  "feature_4": "a_string",
  "feature_5": "a_string",

I have been trying to work out why the input format changes for the model server, but I am missing something (likely something fundamental) needed to understand it.

  • What is the reason for the model server requiring a different input format, while TensorFlow's .predict() accepts a numpy array for both?
  • What is the difference between a shape of (None,) and a shape of (None, 1)?

Google DeepMind Assist

Welcome to the world of TensorFlow! Your question touches on some nuanced aspects of model serving and input shapes, so it’s a great learning opportunity.

Model Server Input Format Differences

The differences in input requirements between using .predict() in a notebook (or local environment) and a model server (like TensorFlow Serving) often stem from how each environment interprets and handles the data shapes and types.

  1. Local Environment (predict() method): When you call .predict() locally, Keras interacts directly with the numpy arrays you provide and is fairly forgiving about shapes. For example, if your model expects inputs of shape (None, 1) but you pass arrays of shape (None,), Keras can often expand or reshape the data as needed without explicit instruction.
  2. Model Server: Model servers such as TensorFlow Serving accept structured JSON requests, and the "instances" format you mentioned is the standardized row format of its REST API. The JSON nesting maps directly onto tensor dimensions: each entry in "instances" is one row of the batch, so a bare scalar inside a row yields a (None,) feature, while a length-1 list yields a (None, 1) feature. The server does not perform the same automatic reshaping that local execution does, so the request structure must match the model's input signature precisely.
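To make point 2 concrete, here is a small sketch of how the two request bodies can be built from the same feature values. The feature values are made up, and the comment about the endpoint URL reflects TensorFlow Serving's usual REST layout rather than anything from your setup:

```python
import json

# Hypothetical feature values. TensorFlow Serving's REST predict endpoint
# typically lives at http://<host>:8501/v1/models/<model_name>:predict
features = {
    "feature_1": "a_string",
    "feature_2": "a_string",
    "feature_3": 91.5,
    "feature_4": "a_string",
    "feature_5": "a_string",
}

# v1 expects (None, 1): each feature value is wrapped in a length-1 list,
# giving every row an explicit inner dimension of 1.
v1_payload = {"instances": [{k: [v] for k, v in features.items()}]}

# v2 expects (None,): each feature value is a bare scalar per row.
v2_payload = {"instances": [features]}

print(json.dumps(v1_payload))
print(json.dumps(v2_payload))
```

Either payload would then be POSTed as the JSON body of the predict request; only the nesting of the values differs.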

Differences Between (None,) and (None, 1) Shapes

The shapes (None,) and (None, 1) might look similar but have important distinctions in how TensorFlow interprets the data:

  • (None,): This shape denotes a 1-D array (a vector) per feature, where None represents an unspecified batch size. Each feature is expected as a flat array of values, without an explicit second dimension.
  • (None, 1): This shape indicates a 2-D array per feature, where the second dimension is explicitly 1. Each individual feature value must be encapsulated in its own length-1 array (or list, in the case of JSON). This is common when the model treats each feature as a sequence of length one or otherwise expects a fixed inner dimension.
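A quick numpy sketch (with made-up values) makes the distinction between the two shapes concrete:

```python
import numpy as np

# A flat 1-D array: one value per example -> shape (batch,)
flat = np.array(["a_string", "b_string"])        # shape (2,)

# A 2-D column: each value in its own length-1 row -> shape (batch, 1)
column = np.array([["a_string"], ["b_string"]])  # shape (2, 1)

print(flat.shape, column.shape)

# reshape(-1, 1) converts the flat layout into the column layout
assert np.array_equal(flat.reshape(-1, 1), column)
```

The data is identical in both cases; only the number of axes differs, which is exactly what (None,) vs (None, 1) encodes.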

Impact on Model Serving

  • For v1 (with input shape (None, 1)), the model server expects each feature to be encapsulated in an array, aligning with the 2D nature of the input shape.
  • For v2 (with input shape (None,)), the model server expects each feature as a single value or a flat list, reflecting the 1D nature of the input shape.
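If you ever need to feed v1 with data prepared in v2's flat layout, adding a trailing axis is enough. This is just a sketch with hypothetical values, not something from your models:

```python
import numpy as np

# Hypothetical feature dict in the v2 (flat, shape (None,)) layout
flat_features = {
    "feature_1": np.array(["a_string"]),  # shape (1,)
    "feature_3": np.array([91.5]),        # shape (1,)
}

# Adding a trailing axis yields the v1 (None, 1) layout
v1_features = {k: v[:, np.newaxis] for k, v in flat_features.items()}

print(v1_features["feature_3"].shape)  # (1, 1)
```

The same trick works in reverse with `.squeeze(axis=1)` if you need to flatten v1-style inputs back down.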

This difference in expected input format is crucial for ensuring that the data passed to the model matches its expected structure, allowing the model to correctly interpret and process the input features.

Understanding these nuances is key to effectively deploying and interacting with TensorFlow models in different environments. It’s all part of the learning process, so your inquiry is right on track!