Dimension mismatch in TF Lite quantized model

I followed several different tutorials on how to run a quantized model with TensorFlow Lite. The relevant part of the code is:

    import numpy as np
    import tensorflow as tf

    audio_windowed, _, audio_original_length = get_audio_input(waveform, sr, overlap_len, hop_size)

    # Load the quantized model
    interpreter = tf.lite.Interpreter(model_path='quantized_model.tflite')
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Transposed / batch-expanded variants, kept around for debugging
    awt = audio_windowed.numpy().transpose((2, 1, 0))
    awe = np.expand_dims(audio_windowed, 0)

    input_shape = input_details[0]['shape']
    input_tensor = np.array(audio_windowed.numpy(), dtype=np.float32)

    input_index = interpreter.get_input_details()[0]["index"]

    interpreter.set_tensor(input_index, input_tensor)

where audio_windowed has shape (11, 43844, 1).
However, I get the following error:

    ValueError: Cannot set tensor: Dimension mismatch. Got 11 but expected 1 for dimension 2 of input 0.

For debugging purposes:

    print("audio_windowed", audio_windowed.shape)
    print("awt", awt.shape)
    print("input_tensor", input_tensor.shape)
    print("input_index", input_index)
    print("input_shape", input_shape)

returns

    audio_windowed (11, 43844, 1)
    awt (1, 43844, 11)
    input_tensor (11, 43844, 1)
    input_index 0
    input_shape [    1 43844     1]

The issue is a dimension mismatch between your input tensor and what the model expects. Your audio_windowed tensor has shape (11, 43844, 1), but the model’s expected input shape is [1, 43844, 1]: it accepts a single instance of size 43844 × 1 per call, while you are providing 11 instances at once.
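In other words, a single window already has the trailing dimensions the model wants; only the batch axis is off. A minimal illustration, using the shapes from your debug output:

    single = audio_windowed[0].numpy()  # shape (43844, 1): one window
    single = np.expand_dims(single, 0)  # shape (1, 43844, 1): matches input_shape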

Here’s how you can resolve this:

  1. Reshape Input Tensor: Reshape your input tensor to match the model’s expected input shape. Since the model expects a single instance, select one instance from your audio_windowed tensor, or modify your data-preparation pipeline to produce a single instance of the correct shape.
  2. Batch Processing: If you intend to process all 11 instances, you need to do it one at a time, since the model’s input shape suggests it can only handle one instance per invocation. Loop over the instances and process them individually.

Here’s a revised version of your code snippet to handle one instance at a time:


    # Load the quantized model
    interpreter = tf.lite.Interpreter(model_path='quantized_model.tflite')
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    input_index = input_details[0]["index"]

    # Process each instance in audio_windowed one at a time
    for i in range(audio_windowed.shape[0]):
        # (43844, 1) -> [1, 43844, 1], the model's expected input shape
        input_tensor = audio_windowed[i].numpy().reshape(input_details[0]['shape'])
        interpreter.set_tensor(input_index, input_tensor.astype(np.float32))
        interpreter.invoke()
        # Get the output and process it as needed
        output_data = interpreter.get_tensor(output_details[0]['index'])
        # ... process output_data ...

In this code, audio_windowed[i].numpy().reshape(input_details[0]['shape']) reshapes each instance to the required input shape of the model. This assumes that each instance in audio_windowed can be reshaped to [1, 43844, 1]. If that’s not the case, you’ll need to adjust your data preparation process accordingly.
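Alternatively, the TFLite interpreter can resize an input tensor before allocation, so if the converted graph tolerates a batch size other than 1 you may be able to feed all 11 windows in a single call. A sketch, not guaranteed to work for every converted model:

    # Alternative: ask the interpreter to accept the whole batch at once
    interpreter = tf.lite.Interpreter(model_path='quantized_model.tflite')
    input_index = interpreter.get_input_details()[0]['index']
    interpreter.resize_tensor_input(input_index, [11, 43844, 1])
    interpreter.allocate_tensors()  # allocate with the new shape

    interpreter.set_tensor(input_index, audio_windowed.numpy().astype(np.float32))
    interpreter.invoke()
    output_data = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])

If resize_tensor_input raises an error, the graph was exported with a hard-coded batch size, and the per-instance loop above is the way to go.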

Thanks for your solution, I will test it within the next few days.
The non-quantized model runs on an input of shape (11, 43844, 1), so I would expect the quantized model to accept the same kind of input. I really do not understand why the model's input was resized.
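A likely explanation, though it depends on how the conversion was done: the TFLite converter typically freezes the batch dimension to 1 unless the model is exported with an explicit input signature. If you control the conversion step, one option is to convert from a concrete function with the full batch shape spelled out. A sketch, assuming a Keras model named model (the name and the quantization setting are illustrative, not taken from your setup):

    # Hypothetical conversion sketch: pin the full (11, 43844, 1) input shape
    # so the converted model accepts the whole batch directly.
    concrete_fn = tf.function(model).get_concrete_function(
        tf.TensorSpec([11, 43844, 1], tf.float32))
    converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
    converter.optimizations = [tf.lite.Optimize.DEFAULT]  # assumed quantization setting
    tflite_model = converter.convert()
    with open('quantized_model.tflite', 'wb') as f:
        f.write(tflite_model)

Otherwise, resize_tensor_input as shown above sidesteps the fixed batch size at inference time.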