Model.predict yields more predictions than the number of test examples

I’ve created a multi-class image classifier using a CNN. I am using Keras, with generators to fit the model and then predict 4 different classes of images. My test_generator has 394 examples (all four classes combined), but model.predict returns predictions of shape (6304, 4).

Here’s the model summary:

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 IP (Conv2D)                 (None, 64, 64, 32)        320       
                                                                 
 Convolution0 (Conv2D)       (None, 64, 64, 64)        18496     
                                                                 
 PL0 (MaxPooling2D)          (None, 32, 32, 64)        0         
                                                                 
 Convolution1 (Conv2D)       (None, 32, 32, 128)       73856     
                                                                 
 PL1 (MaxPooling2D)          (None, 16, 16, 128)       0         
                                                                 
 Convolution2 (Conv2D)       (None, 16, 16, 256)       295168    
                                                                 
 PL2 (MaxPooling2D)          (None, 8, 8, 256)         0         
                                                                 
 FL (Flatten)                (None, 16384)             0         
                                                                 
 FC (Dense)                  (None, 128)               2097280   
                                                                 
 OP (Dense)                  (None, 4)                 516       
                                                                 
=================================================================
Total params: 2,485,636
Trainable params: 2,485,636
Non-trainable params: 0
_________________________________________________________________

Here’s how I created the test_generator: test_generator = core_imageDataGenerator(test_directory) and the result of len(test_generator.classes) is 394.

Here’s how I made the predictions: predictions = model.predict(test_generator). The result of predictions.shape is (6304, 4) and not (394, 4). What could be the reason for this? Am I doing something wrong?

I am posting this here because I think this may be a bug of some sort. Also, what are my options here? My next step is to create a classification report with a variety of metrics.

Are you using image_dataset_from_directory?
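
One thing worth checking while you answer that: 6304 is exactly 16 × 394, so every test image appears to produce 16 prediction rows. A possible explanation (an assumption on my part, not something the post confirms) is an image-size mismatch. image_dataset_from_directory defaults to image_size=(256, 256), and a 256×256 image holds 16 times as many values as the 64×64 input in the model summary. The sketch below only does that arithmetic. The function names are made up for illustration, the (64, 64, 1) input shape is inferred from the summary (320 parameters in the first Conv2D fits a 3×3 kernel on a single channel), and the (256, 256, 1) shape is hypothetical.

```python
def images_per_prediction_ratio(n_predictions, n_examples):
    """Return how many prediction rows each example produced, or None if uneven."""
    if n_examples <= 0 or n_predictions % n_examples != 0:
        return None
    return n_predictions // n_examples


def size_mismatch_explains(ratio, fed_shape, expected_shape):
    """True if one fed image holds exactly `ratio` times the values the model expects.

    Both shapes are (height, width, channels) tuples.
    """
    fed_values = fed_shape[0] * fed_shape[1] * fed_shape[2]
    expected_values = expected_shape[0] * expected_shape[1] * expected_shape[2]
    return fed_values == ratio * expected_values


# Numbers taken from the question.
ratio = images_per_prediction_ratio(6304, 394)
print(ratio)  # 16

expected_shape = (64, 64, 1)   # inferred from the model summary (assumption)
fed_shape = (256, 256, 1)      # hypothetical: default image_size, grayscale

print(size_mismatch_explains(ratio, fed_shape, expected_shape))  # True
```

To see the real shapes, compare model.input_shape with the shape of a single batch drawn from your test data. If they differ, resizing at load time may help, for example by passing image_size=(64, 64) when the data is built with image_dataset_from_directory. I can't tell what core_imageDataGenerator does internally, so treat this as a lead to verify rather than a diagnosis.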