U-Net trains very slowly

Hello everyone!

I have written a U-Net in Python with TensorFlow and Keras. With it I want to predict CT images from MR images. It’s nothing fancy and the code is very well commented:

import csv
import nibabel as nib
import tensorflow as tf

# U-Net architecture
def UNet_model_2D(img_height, img_width, clr_channels):
    
    inputs = tf.keras.layers.Input(shape=(img_height, img_width, clr_channels))
    
    # contraction path
    c1 = tf.keras.layers.Conv2D(32, (3,3), activation='relu', kernel_initializer='he_normal', padding='same')(inputs)
    c1 = tf.keras.layers.Dropout(0.1)(c1)
    c1 = tf.keras.layers.Conv2D(32, (3,3), activation='relu', kernel_initializer='he_normal', padding='same')(c1)
    p1 = tf.keras.layers.MaxPooling2D((2,2))(c1)
    
    # ...
    
    # base
    c5 = tf.keras.layers.Conv2D(512, (3,3), activation='relu', kernel_initializer='he_normal', padding='same')(p4)
    c5 = tf.keras.layers.Dropout(0.3)(c5)
    c5 = tf.keras.layers.Conv2D(512, (3,3), activation='relu', kernel_initializer='he_normal', padding='same')(c5)
    
    # expansive path 
    u6 = tf.keras.layers.Conv2DTranspose(256, (2,2), strides=(2,2), padding='same')(c5)
    u6 = tf.keras.layers.concatenate([u6, c4])
    c6 = tf.keras.layers.Conv2D(256, (3,3), activation='relu', kernel_initializer='he_normal', padding='same')(u6)
    c6 = tf.keras.layers.Dropout(0.2)(c6)
    c6 = tf.keras.layers.Conv2D(256, (3,3), activation='relu', kernel_initializer='he_normal', padding='same')(c6)
     
    # ...
    
    outputs = tf.keras.layers.Conv2D(1, (1, 1), activation='linear')(c9) 
    model = tf.keras.Model(inputs=[inputs], outputs=[outputs])
    
    # configure the model with optimizer and loss
    model.compile(optimizer='adam', loss='mean_absolute_error')

    model.summary()
    
    return model

################################ main program #################################

# load training data concatenated along 0th dimension
X = nib.load('MRIs_train.nii.gz').get_fdata()
Y = nib.load('CTs_train.nii.gz').get_fdata()

# load validation data concatenated along 0th dimension
x_val = nib.load('MRIs_val.nii.gz').get_fdata()
y_val = nib.load('CTs_val.nii.gz').get_fdata()

# files containing the loss values over time will be added to this folder
callbacks = [tf.keras.callbacks.TensorBoard(log_dir='log_survival')]
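# (the loss curves can then be viewed with: tensorboard --logdir log_survival)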
epochs = 50

# get shape of concatenated MRIs/CTs
s = X.shape

model = UNet_model_2D(s[1], s[2], 1)

# fit() returns a History object whose '.history' attribute records the
# training loss, metrics, validation loss, and validation metrics per epoch
results = model.fit(
    x=X,  # input data
    y=Y,  # ground truth data
    batch_size=16, 
    epochs=epochs,
    verbose=1,
    callbacks=callbacks, 
    validation_data=(x_val, y_val),
    use_multiprocessing=True,
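    # (as far as I understand the Keras docs, use_multiprocessing only has an
    # effect for generator / keras.utils.Sequence inputs, so with plain NumPy
    # arrays like these it may be ignored)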
)

# per-epoch loss curves, accessed by key instead of dict position
train_loss = results.history['loss']
val_loss = results.history['val_loss']

# write/append the loss curves to CSV files
with open('log_train_loss.csv', 'a', newline='') as f:
    csv.writer(f).writerow(train_loss)

with open('log_val_loss.csv', 'a', newline='') as f:
    csv.writer(f).writerow(val_loss)

model.save(f'pCT_2D_{epochs}ep', save_format='tf')  # export in SavedModel format

I started training over 5 hours ago (112 concatenated 3D MR/CT volumes as training data and 25 as validation data), and it’s not even through the first of 50 epochs.

My guess is that Spyder is not using the available 32 processor cores and/or the Quadro GV100 (32 GB) GPU. I also have 512 GB of RAM at my disposal.
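To check whether TensorFlow even sees the GPU from the environment Spyder runs in, I assume a minimal check like this would tell me:

import tensorflow as tf

# list the GPUs TensorFlow can see; an empty list means training
# runs entirely on the CPU
print(tf.config.list_physical_devices('GPU'))

# check whether this TensorFlow build was compiled with CUDA support
print(tf.test.is_built_with_cuda())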

The only thing I did in the Python code regarding multiprocessing was to set use_multiprocessing=True.
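If the GPU is visible, I could additionally turn on device placement logging before building the model, so every op reports whether it lands on the GPU or the CPU (a minimal sketch):

import tensorflow as tf

# log the device (CPU/GPU) every operation is placed on;
# must be called before any ops are created
tf.debugging.set_log_device_placement(True)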

Thanks in advance for ANY support!