How to do batch prediction in PySpark?

Hi Team, I have built a TF2 model that does image classification. Currently, batch inference on 50 images takes 42 seconds, and I run it sequentially. Is there a way to do batch prediction in parallel using PySpark? I have 70K images to run inference on as a batch every day.

Can anyone please help me solve this problem?

Please find my script below:

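import numpy as np
import tensorflow as tf
from PIL import Image, ImageOps

# Load the exported Keras model once
model = tf.keras.models.load_model('export/model')

# Load one image, apply its EXIF orientation, and resize it to the model input size
image_path = 'new_image.png'
image = Image.open(image_path).convert('RGB')
image = ImageOps.exif_transpose(image)
image = np.array(image.resize((224, 224)))
image = np.expand_dims(image, axis=0)  # add the batch dimension: (1, 224, 224, 3)
prediction_result = model.predict(image)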
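
The script above sends a single image through the model per call. Before reaching for Spark, it may help to stack several images into one array and predict on the whole batch at once, since per-call overhead can dominate when images go through one at a time. The following is only a sketch: `chunked`, `load_image`, `predict_paths`, and the batch size of 32 are illustrative names and values that do not come from the original script, and it assumes a single-output model whose input is (224, 224, 3).

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def load_image(path, size=(224, 224)):
    """Same preprocessing as the script above, returning an (H, W, 3) array."""
    # Imported lazily so the pure-Python helper above has no heavy dependencies.
    import numpy as np
    from PIL import Image, ImageOps

    image = Image.open(path).convert("RGB")
    image = ImageOps.exif_transpose(image)
    return np.asarray(image.resize(size))


def predict_paths(model, paths, batch_size=32):
    """Predict for every path, sending `batch_size` images per model call."""
    import numpy as np

    results = []
    for chunk in chunked(paths, batch_size):
        batch = np.stack([load_image(p) for p in chunk])  # (n, 224, 224, 3)
        # Assumes a single-output model, so each row is one prediction.
        results.extend(model.predict_on_batch(batch))
    return results
```

With the `model` loaded as above, this could be called as `predict_paths(model, image_paths)`, where `image_paths` is a hypothetical list of file paths. Whether this is fast enough for 70K images depends on the hardware, so it is worth timing before adding Spark.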

Can anyone please help me run this inference using PySpark?

Currently, I use a for loop to run the predictions sequentially.
#pyspark
#tf2 #keras #inference #tensorflow #batchinference

Could you please provide an answer to this?
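
One possible way to parallelize this with PySpark is to distribute the image paths as an RDD and use `mapPartitions`, so each partition loads the model once and then predicts in batches. This is a rough sketch under several assumptions that are not in the original post: the model directory `export/model` and the image files are readable from every executor (for example on shared storage), TensorFlow and Pillow are installed on the workers, and the model has a single output. The helper names `predict_partition`, `make_keras_predictor`, and `run_on_spark`, the batch size, and the partition count are all hypothetical.

```python
def predict_partition(paths, make_predictor, batch_size=32):
    """Run on one partition: build the predictor once, then predict in batches."""
    predict = make_predictor()  # e.g. loads the model once per partition
    chunk = []
    for path in paths:
        chunk.append(path)
        if len(chunk) == batch_size:
            yield from zip(chunk, predict(chunk))
            chunk = []
    if chunk:  # leftover images that did not fill a whole batch
        yield from zip(chunk, predict(chunk))


def make_keras_predictor(model_dir="export/model"):
    """Return a function mapping a list of image paths to a list of predictions."""
    # Heavy imports happen on the executor, not on the driver.
    import numpy as np
    import tensorflow as tf
    from PIL import Image, ImageOps

    model = tf.keras.models.load_model(model_dir)

    def load_image(path):
        image = Image.open(path).convert("RGB")
        image = ImageOps.exif_transpose(image)
        return np.asarray(image.resize((224, 224)))

    def predict(paths):
        batch = np.stack([load_image(p) for p in paths])  # (n, 224, 224, 3)
        # Convert to plain lists so results are cheap to send back to the driver.
        return model.predict_on_batch(batch).tolist()

    return predict


def run_on_spark(image_paths, num_partitions=16):
    """Distribute the paths and collect (path, prediction) pairs on the driver."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("batch-inference").getOrCreate()
    rdd = spark.sparkContext.parallelize(image_paths, num_partitions)
    return rdd.mapPartitions(
        lambda part: predict_partition(part, make_keras_predictor)
    ).collect()
```

Loading the model inside the partition function avoids having to serialize it from the driver, and it means the load cost is paid once per partition instead of once per image. For 70K images, writing the results out from the executors may be preferable to `collect()`, depending on how large each prediction is.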