How to measure data fetching, forward and backward pass time during training

purvang_lapsiwala · September 21, 2023, 4:07am

How can I measure the data fetching, data preparation, forward pass, and backward pass time for each epoch in a training script?

Kiran_Sai_Ramineni · September 21, 2023, 8:54am

Hi @purvang_lapsiwala, the model.fit logs report the time taken per step. A training step is generally one gradient update, i.e. one step processes batch_size examples. If your model defines data preprocessing steps, the time they take is also counted in the time shown in the logs. To measure fetching, preprocessing, and the gradient update separately, you have to write a custom training loop. Thank you.
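As a rough illustration of the custom-loop approach described above, here is a minimal, framework-agnostic sketch that accumulates wall-clock time per phase. `PhaseTimer`, `timed_epoch`, and the `prepare`/`forward`/`backward` callables are hypothetical names made up for this example, not TensorFlow APIs. In a TensorFlow loop, `forward` would presumably be the code run inside `tf.GradientTape()`, and `backward` the `tape.gradient(...)` plus `optimizer.apply_gradients(...)` calls. The stand-in lambdas at the bottom only exist so the sketch runs on its own.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Runs even when the body breaks out of the loop.
            self.totals[name] += time.perf_counter() - start


def timed_epoch(batches, prepare, forward, backward, timer):
    """Run one epoch and time fetch, prepare, forward and backward separately."""
    iterator = iter(batches)
    steps = 0
    while True:
        with timer.phase("fetch"):
            try:
                batch = next(iterator)  # data fetching
            except StopIteration:
                break
        with timer.phase("prepare"):
            inputs = prepare(batch)     # data preparation
        with timer.phase("forward"):
            state = forward(inputs)     # forward pass and loss
        with timer.phase("backward"):
            backward(state)             # gradients and weight update
        steps += 1
    return steps


# Stand-in data and callables (assumptions for illustration only).
timer = PhaseTimer()
steps = timed_epoch(
    batches=[[1.0, 2.0], [3.0, 4.0]],
    prepare=lambda b: [x / 10.0 for x in b],
    forward=lambda x: sum(x),
    backward=lambda loss: None,
    timer=timer,
)
for name, seconds in timer.totals.items():
    print(f"{name}: {seconds:.6f}s over {steps} steps")
```

Two caveats if you adapt this to TensorFlow: Python timers placed inside a function decorated with `@tf.function` only run while the function is being traced, so per-phase timing like this applies to eager execution. Also, GPU work can be dispatched asynchronously, so you may need to pull a value back to the host (for example `loss.numpy()`) before stopping the clock to get meaningful numbers.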

purvang_lapsiwala · September 22, 2023, 2:45pm

@Kiran_Sai_Ramineni Thank you for your reply. After writing a custom loop, my loss (very high) and accuracy (low) don't match what I get with model.fit.

I followed the tutorial linked below to write the custom training loop: Custom training with tf.distribute.Strategy | TensorFlow Core.

model.fit:
loss: 0.2, mIoU: 0.61

custom loop:
loss: 13107.74, mIoU: 0.1457