Can't predict dataset with varying resolution

Hi. After many attempts of trial and error, I was able to create a model that trains on a dataset of images with varying resolutions.

All works well, but when I save this model and later load it, it can’t predict on multiple images with different resolutions.

A workaround is to load the model before each image, but that’s not ideal at all.

Is there a way to fix this? Or is it a bug?

The error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [1,10,10,3] vs. shape[1] = [1,12,9,3] [Op:ConcatV2] name: concat

Full traceback:

Traceback (most recent call last):
  File "c:\Users\samue\Desktop\test\predicting.py", line 22, in <module>
    predicts = conv_model.predict(dataset)
  File "C:\Users\samue\AppData\Roaming\Python\Python310\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\samue\AppData\Roaming\Python\Python310\site-packages\tensorflow\python\framework\ops.py", line 7186, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [1,10,10,3] vs. shape[1] = [1,12,9,3] [Op:ConcatV2] name: concat

How to reproduce:

I’ve created a minimal reproducible example with only two images, with resolutions of [10, 10] and [9, 12], both saved as .png in the RGB colorspace.

Running training.py trains a model with just a single Conv2D layer for 1 epoch and saves it as model.h5.

Folder structure:

/main_folder
--training.py
--predicting.py
--/data
   --001.png
   --002.png

training.py

import cv2, os
import keras
import tensorflow as tf
from keras import layers


strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    input_layer = keras.Input(shape=(None, None, 3))
    out = layers.Conv2D(3, (3, 3), activation='sigmoid', padding='same')(input_layer)

    conv_model = keras.Model(input_layer, out)
    conv_model.compile(
        optimizer='adam', 
        loss=tf.keras.losses.MeanSquaredError()
    )

conv_model.summary()

path = "data"
data = [cv2.imread(os.path.join(path, f)) / 255 for f in os.listdir(path)]


def data_generator():
    for i in range(len(data)):
        yield data[i], data[i]


dataset = tf.data.Dataset.from_generator(
    data_generator, 
    output_types=(tf.float32, tf.float32), 
    output_shapes=((None, None, 3), (None, None, 3))
).batch(1)

conv_model.fit(
    dataset,
    epochs=1,
    validation_data=dataset
)

conv_model.save('model.h5')

predicting.py

import cv2, os
import keras
import tensorflow as tf


path = "data"
data = [cv2.imread(os.path.join(path, f)) / 255 for f in os.listdir(path)]


def data_generator():
    for i in range(len(data)):
        yield data[i], data[i]


dataset = tf.data.Dataset.from_generator(
    data_generator, 
    output_types=(tf.float32, tf.float32), 
    output_shapes=((None, None, 3), (None, None, 3))
).batch(1)

conv_model = keras.models.load_model('model.h5')
predicts = conv_model.predict(dataset)

for i in predicts:
    print(i)

Resolved on Stack Overflow.

Copy of the answer:

The method model.predict tries to pack all of its outputs into a single tensor/NumPy array. That only works if the outputs have the same dimensions across all samples, which is not the case here, so the concatenation fails.
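To make the failure concrete, here is a minimal NumPy-only sketch of the same packing step. It is not Keras's actual code path; the arrays out_a and out_b are hypothetical stand-ins for the two per-batch outputs, using the shapes reported in the error message.

```python
import numpy as np

# Hypothetical per-batch outputs with the shapes from the error message.
out_a = np.zeros((1, 10, 10, 3), dtype=np.float32)
out_b = np.zeros((1, 12, 9, 3), dtype=np.float32)

try:
    # Joining along the batch axis requires every other axis to match.
    np.concatenate([out_a, out_b], axis=0)
    concat_failed = False
except ValueError as err:
    concat_failed = True
    print("concatenate failed:", err)

# A plain Python list has no such constraint, so it can hold both outputs.
outputs = [out_a, out_b]
print([o.shape for o in outputs])
```

Keeping the outputs in a list instead of one array is what sidesteps the shape constraint.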

When the dimensions vary, you can iterate over the dataset directly and call the model on one sample at a time:

# The dataset yields (input, target) pairs; pass only the input to the model.
predicts = [conv_model(x) for x, _ in dataset]

Note that for a small number of inputs (e.g. batch_size=1), it is recommended to call the model directly via __call__ rather than use the predict method.
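For anyone who wants to try this without the saved model.h5, below is a self-contained sketch of the per-sample approach. It assumes a freshly built, untrained model with the same single Conv2D layer as in the question, and random arrays in place of the two PNG files; the names model and images are illustrative only.

```python
import numpy as np
import tensorflow as tf

# Stand-in for the loaded model: one Conv2D layer that accepts any height/width.
inputs = tf.keras.Input(shape=(None, None, 3))
outputs = tf.keras.layers.Conv2D(
    3, (3, 3), activation="sigmoid", padding="same"
)(inputs)
model = tf.keras.Model(inputs, outputs)

# Random "images" with the two shapes from the error message (made-up data).
images = [
    np.random.rand(10, 10, 3).astype("float32"),
    np.random.rand(12, 9, 3).astype("float32"),
]

# Call the model once per image, adding a batch axis of size 1 each time.
# The results stay in a Python list, so their shapes are free to differ.
predictions = [
    model(img[np.newaxis, ...], training=False).numpy() for img in images
]

for p in predictions:
    print(p.shape)
```

Because padding='same' preserves height and width, each prediction should have the same shape as its input plus the leading batch axis.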