Hi, new here. I’m trying to write a custom training loop from scratch and have a general question about epochs.
Let’s say I have a function that randomly draws a batch of batch_size samples from the training data. Something like this:
import numpy as np

def random_batch(X, y, batch_size=32):
    # Draw batch_size indices uniformly at random (with replacement)
    idx = np.random.randint(len(X), size=batch_size)
    return X[idx], y[idx]
Let’s say I train for 50 epochs, and within each epoch I call the above function 1000 times, because the training set has 32,000 samples (32,000 / 32 = 1000).
My question is: for each epoch, is it necessary to ensure that EVERY sample in the training data goes through the network exactly once? Or is it okay if I just randomly select 32 samples at each step? In other words, do I need an additional step in the code that drops the 32 samples from the training data once they have gone through the network, so that they won’t be drawn again in later steps of the same epoch? That would ensure every sample is seen once per epoch. Or is this not necessary?
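To make the second option concrete, here is a rough sketch of what I mean by sending every sample through exactly once per epoch. Instead of literally dropping samples, it shuffles the indices once and slices them into batches. The name epoch_batches and the rng parameter are placeholders I made up, and I’m not sure this is the standard way to do it.

```python
import numpy as np

def epoch_batches(X, y, batch_size=32, rng=None):
    """Yield batches that together cover every sample exactly once.

    Hypothetical helper for illustration; not taken from any library.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One random ordering of all indices for this epoch (no repeats)
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        # The last batch is smaller if len(X) is not a multiple of batch_size
        yield X[idx], y[idx]
```

With 32,000 samples and batch_size=32 this would give 1000 batches per epoch, the same number of steps as calling random_batch 1000 times, except that no sample is repeated or skipped within an epoch.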
Thanks and sorry for the newbie question!