How can we compare loss and metrics evaluated on training and validation sets of different sizes?

Let N_train, N_val, and N_test be the numbers of examples in the training, validation, and test sets.

As I understand it, these sizes are generally chosen so that N_train >> N_val ~= N_test.

As I understand it, the loss and metrics are evaluated (in an average sense) over the whole training set. In this context, how can we compare performance across sets of such different sizes?
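(By "in an average sense" I mean the per-example mean, e.g. L_train = (1/N_train) * sum of the individual example losses over the N_train training examples, and similarly L_val averaged over the N_val validation examples.)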

Why isn't training performance instead evaluated on a subset of the training set whose size is comparable to that of the validation or test set?

One could argue that this would increase the computational cost, but we could at least draw a random sample from the per-example losses that were already evaluated during the training step.
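For concreteness, here is roughly what I have in mind (a minimal sketch with made-up numbers; `per_example_train_losses` is a hypothetical array of losses recorded while iterating over the training set during one epoch):

```python
import numpy as np

# Purely illustrative sizes and values.
N_train, N_val = 50_000, 5_000
per_example_train_losses = np.random.rand(N_train)

# Average over a randomly drawn, validation-sized subset of the training
# losses instead of over the full training set.
rng = np.random.default_rng(0)
subset = rng.choice(per_example_train_losses, size=N_val, replace=False)

print("mean loss on val-sized training subset:", subset.mean())
print("mean loss on full training set:        ", per_example_train_losses.mean())
```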

Please let me know if there are any reasons behind this approach!

Hi @Rajesh_Nakka, the training loss is calculated on the training data, not on the val or test data. During training, the model computes the loss w.r.t. the train data (batch-wise) and updates the weights. The model is then evaluated on the val data, and once it has been trained for the specified number of epochs, the final model is used to run inference on the test data. Please make sure that the train, val, and test sets are independent of each other and come from the same distribution. In short, the sizes of the val and test sets do not affect the loss, but the train set does. Thank you.
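To make the last point concrete, here is a minimal Keras sketch (the data, model, and shapes are made up purely for illustration). Because the reported losses are per-example averages, the train, val, and test losses come out on the same scale even though the sets have very different sizes:

```python
import numpy as np
import tensorflow as tf

# Made-up regression data: a large train set and much smaller val/test sets.
x_train, y_train = np.random.rand(50_000, 10), np.random.rand(50_000, 1)
x_val,   y_val   = np.random.rand(5_000, 10),  np.random.rand(5_000, 1)
x_test,  y_test  = np.random.rand(5_000, 10),  np.random.rand(5_000, 1)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# The loss is computed batch-wise on the training data and used to update the
# weights; the validation data is only evaluated (no weight updates) each epoch.
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=2, batch_size=128)

# After training, the held-out test set is used once for the final evaluation.
# "mse" is a per-example average, so this number is comparable to the training
# loss despite the 10x difference in set size.
test_loss = model.evaluate(x_test, y_test)
print("test loss:", test_loss)
```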