Why should I set 'steps_per_epoch' manually?

Hi there.
I’m new here.
I’m using TF2 to train an LSTM model to predict stock prices.
It sounds crazy, but I just want to give it a try.

I have a basic question about the ‘steps_per_epoch’ parameter of model.fit().
As we know, if we repeat the dataset with the repeat() function,
then we must set ‘steps_per_epoch’ to some value.
So here is the question.
I pass the training dataset to TF,
I pass the batch_size to TF,
and I still have to set ‘steps_per_epoch’ manually myself.
That seems unreasonable.
Since TF2 knows “len(dataset_train)” and “batch_size”,
why can’t TF2 calculate ‘steps_per_epoch’ as “len(dataset_train) / batch_size”
and set it automatically by itself?

The code is below.

# Assign the transformed dataset back, otherwise the chain has no effect.
self.dataset_train = (self.dataset_train
                      .cache()
                      .shuffle(buffer_size)
                      .batch(self.batch_size, drop_remainder=True)
                      .repeat())

self.steps_per_epoch = 200
self.hist = self.model.fit(self.dataset_train,
                           steps_per_epoch=self.steps_per_epoch,
                           verbose=self.verbose, callbacks=[tensor_board])

Hi @blackdove0430, when training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined.

If x is a tf.data dataset, and steps_per_epoch is None, the epoch will run until the input dataset is exhausted.

When passing an infinitely repeating dataset, you must specify the steps_per_epoch argument.
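To make the difference concrete, here is a minimal pure-Python sketch (using itertools.cycle as a stand-in for dataset.repeat(); the numbers are made up for illustration):

```python
import itertools

samples = list(range(100))           # stand-in for a 100-row training set
repeated = itertools.cycle(samples)  # analogue of dataset.repeat(): never exhausted

# Iterating `repeated` on its own would never stop, so the training loop
# needs an explicit cut-off. steps_per_epoch is that cut-off: one epoch
# consumes steps_per_epoch * batch_size samples, then stops.
batch_size = 10
steps_per_epoch = 10
one_epoch = list(itertools.islice(repeated, steps_per_epoch * batch_size))
print(len(one_epoch))  # 100 samples consumed in this epoch
```

Without the cut-off, the first epoch would never finish, which is why Keras raises an error when you pass an infinitely repeating dataset without steps_per_epoch.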

Thank You.

Hi @Kiran_Sai_Ramineni

Thank you very much.
I think I understand now.

When using the repeat() function, the dataset repeats forever (it will never be exhausted),
so we must specify some value for steps_per_epoch to tell Keras where each epoch ends.
On the other hand, if we do not use repeat(),
TF2 will calculate ‘steps_per_epoch’ by itself as “len(dataset_train) / batch_size”.
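As a rough sketch of that calculation (the helper function is my own, not a Keras API), the inferred value for a finite dataset depends on whether the last partial batch is kept or dropped:

```python
import math

def steps_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches consumed per epoch for a finite dataset.

    With drop_remainder=True the last partial batch is dropped (floor
    division); otherwise it is kept as a smaller batch (ceiling).
    """
    if drop_remainder:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

print(steps_per_epoch(100, 10))          # 10
print(steps_per_epoch(105, 10))          # 10 (the 5 leftover samples are dropped)
print(steps_per_epoch(105, 10, False))   # 11 (last batch has only 5 samples)
```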

Here comes the next question.
In order to get a better training result, what value should ‘steps_per_epoch’ be set to?
For example, with “num_1 = len(dataset_train) / batch_size”:

a = num_1
b = num_1 * 2
c = num_1 * 3
d = num_1 * 10

Which of a, b, c, d will be the best value?
I will run some tests to see what happens.

len(dataset_train) / batch_size should be good for steps_per_epoch. Thank You.

Thank you so much.
Thank you for your answer.

Yes you are correct.

So you know what the repeat() operation does.

Now let’s say you have fixed-length data with no repeat. You set the batch size to 10, and your entire dataset has 100 entries. Each epoch will therefore consume the whole dataset of 100 entries, and since the batch size is 10, it will do so in 10 steps, which is exactly what steps_per_epoch means.

Now let’s say your batch size is 32 and you use self.steps_per_epoch = 200. This means a total of 32 * 200 = 6400 rows/entries from the repeated dataset will be used to train the model in every epoch.
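A quick sanity check of that arithmetic (the dataset size of 1000 is a made-up number for illustration):

```python
batch_size = 32
steps_per_epoch = 200

# Rows consumed by one epoch of the repeated dataset.
rows_per_epoch = batch_size * steps_per_epoch
print(rows_per_epoch)  # 6400

# If the underlying (un-repeated) dataset had 1000 rows, each row would
# be seen about 6400 / 1000 = 6.4 times per epoch on average.
underlying_rows = 1000
print(rows_per_epoch / underlying_rows)  # 6.4
```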