Train a model on a large dataset

I have loaded the training and validation images into X_train and X_valid, respectively.
X_train shape is (6000, 100, 224, 224, 3); y_train is (6000, 5).
X_valid shape is (2000, 100, 224, 224, 3); y_valid is (6000, 5).

How can I train a model on data this large?
Thank you

Hi @youb,

I recommend following the tutorial on “Image Classification with Model Garden” provided by TensorFlow. This tutorial will guide you through the process of training a custom dataset using pre-trained models. It serves as a great starting point for your task.

Once you have gone through the tutorial, you can further explore the available pre-trained models in the TensorFlow Official Models GitHub repository. The repository provides a wide range of pre-trained models for image classification. You can experiment with different models to see which one works best for your specific dataset and requirements.

By combining the tutorial and exploring the TensorFlow Official Models repository, you’ll have a solid foundation for training image classification models on your large dataset. This will allow you to leverage the power of pre-trained models and customize them for your specific needs.

TensorFlow documentation that can help you with training models on large datasets:

  1. TensorFlow Data API overview: Build TensorFlow input pipelines (TensorFlow Core)
  2. Keras: The high-level API for TensorFlow (TensorFlow Core)
  3. TensorFlow model optimization (TensorFlow Model Optimization)
  4. Serialization and saving (TensorFlow Core)

Please let me know if you need any further help.


My question is not about using a pre-trained model; I am very familiar with that. The problem is that I am not able to train the model with the data loaded into arrays: x_train, y_train…

I am asking whether there are ways to train a deep model on a large dataset.

Thank you :slight_smile:

Hi @youb ,

Sorry for the confusion.

Could you check the code sample below? It might help you train the model on the large dataset:

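import tensorflow as tf

# Define your model architecture
model = tf.keras.Sequential([
    # Add your desired layers here. The filter count, kernel size, and
    # input_shape are placeholders; input_shape must match one sample of X_train.
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    # Collapse the spatial dimensions so the output matches the (N, 5) labels
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation='softmax'),
])

# Compile the model (optimizer and loss are assumptions; categorical
# cross-entropy expects one-hot labels)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Define batch size and number of epochs
batch_size = 32  # try 16, 32, or 64
epochs = 10

# Create a TensorFlow Dataset for training data
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_dataset = train_dataset.batch(batch_size)

# Create a TensorFlow Dataset for validation data
valid_dataset = tf.data.Dataset.from_tensor_slices((X_valid, y_valid))
valid_dataset = valid_dataset.batch(batch_size)

# Train the model
model.fit(train_dataset, validation_data=valid_dataset, epochs=epochs)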

The above code is taken from the TensorFlow code examples.

Is the information I provided what you were looking for? Please let me know if there is anything else you need or if you have any specific expectations or requirements.


It does not work.
What I have done: I loaded the images into X_train and the labels into y_train,
X_train: (6000,6,224,224,3), y_train: (6000,5)
X_valid: (2000,6,224,224,3), y_valid: (2000,5)
then I used
train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train))
valid_ds = tf.data.Dataset.from_tensor_slices((X_valid, y_valid))
… shuffle, batch, prefetch

But I am not able to train the model, even though I have a machine with (GPU 64, CPU: 64).

Thank you
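One commonly suggested alternative to from_tensor_slices for big in-memory arrays is to stream samples through a Python generator. As far as I know, the tf.data guide notes that from_tensor_slices embeds NumPy arrays as constants, which copies the data. The sketch below is only an illustration under that assumption: make_dataset, X_small, and y_small are hypothetical names, and the tiny arrays stand in for the real X_train and y_train.

```python
import numpy as np
import tensorflow as tf


def make_dataset(features, labels, batch_size, shuffle=False):
    """Build a tf.data pipeline that pulls one (sample, label) pair at a time."""
    num_samples = len(features)

    def sample_generator():
        # A fresh index order is drawn every time the dataset is iterated,
        # so each epoch sees a different shuffle.
        if shuffle:
            order = np.random.permutation(num_samples)
        else:
            order = np.arange(num_samples)
        for i in order:
            yield features[i], labels[i]

    signature = (
        tf.TensorSpec(shape=features.shape[1:], dtype=tf.as_dtype(features.dtype)),
        tf.TensorSpec(shape=labels.shape[1:], dtype=tf.as_dtype(labels.dtype)),
    )
    ds = tf.data.Dataset.from_generator(sample_generator, output_signature=signature)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)


# Tiny stand-in arrays (hypothetical); the real ones would be X_train / y_train.
X_small = np.zeros((8, 6, 32, 32, 3), dtype=np.uint8)
y_small = np.zeros((8, 5), dtype=np.float32)

train_ds = make_dataset(X_small, y_small, batch_size=4, shuffle=True)
```

The resulting train_ds can be passed to model.fit in the same way as a dataset built with from_tensor_slices. Whether this resolves the failure depends on the actual error, which the thread does not show.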

Hi @youb ,

Can you provide the error logs you are encountering and share a standalone code snippet that I can review?



This might help you

I don’t understand the size of your datasets. Does the training dataset have 6,000 samples? If so, you would have 6,000 images of shape (224, 224, 3), so I would expect the training dataset to be a four-dimensional tensor of shape (6000, 224, 224, 3), but you have a five-dimensional tensor instead. It seems you have labels for 6,000 training images, but the array actually holds 6,000 × 6 = 36,000 images!

Your problem is not so much the model or the dataset, but how you train.
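To put numbers on the memory side, here is a back-of-envelope estimate for the two X_train shapes quoted in this thread. It is only arithmetic: array_bytes is a hypothetical helper, and the dtypes (uint8 for raw pixels, float32 after conversion) are assumptions, since the thread does not say which one is used.

```python
import math


def array_bytes(shape, bytes_per_element):
    """Size in bytes of a dense array with the given shape and element size."""
    return math.prod(shape) * bytes_per_element


shapes = {
    "X_train (first post)": (6000, 100, 224, 224, 3),
    "X_train (later post)": (6000, 6, 224, 224, 3),
}

for name, shape in shapes.items():
    for dtype_name, element_size in (("uint8", 1), ("float32", 4)):
        gigabytes = array_bytes(shape, element_size) / 1e9
        print(f"{name} as {dtype_name}: {gigabytes:.1f} GB")
```

Under these assumptions the later shape needs about 5.4 GB as uint8 and about 21.7 GB as float32, while the first shape needs about 90 GB and 361 GB respectively, before any extra copies made by the input pipeline.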

Instead of loading everything in memory, you could save your datasets (inputs, labels) into an HDF5 file (use h5py for that) and then use tfio (the tensorflow/io project on GitHub: dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO) to create a dataset from that file. I’m sorry I cannot give you a more detailed explanation of how to do that, but I hope the pointers to h5py and tfio can already help you out.
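A minimal sketch of the HDF5 idea follows. Note that it swaps tfio for a plain h5py generator, so it is not the tfio route itself, just one possible way to read batches from disk. The file name train.h5, the dataset names "images" and "labels", and both helper functions are hypothetical, and zero-filled arrays stand in for real image loading.

```python
import os
import tempfile

import h5py
import numpy as np


def write_hdf5(path, num_samples, sample_shape, num_classes, chunk=4):
    """Write images and one-hot labels to disk a few samples at a time."""
    with h5py.File(path, "w") as f:
        x = f.create_dataset("images", shape=(num_samples, *sample_shape), dtype="uint8")
        y = f.create_dataset("labels", shape=(num_samples, num_classes), dtype="float32")
        for start in range(0, num_samples, chunk):
            stop = min(start + chunk, num_samples)
            # Stand-in for loading real images for samples start..stop-1.
            x[start:stop] = np.zeros((stop - start, *sample_shape), dtype=np.uint8)
            class_ids = np.arange(start, stop) % num_classes
            y[start:stop] = np.eye(num_classes, dtype=np.float32)[class_ids]


def hdf5_batches(path, batch_size):
    """Yield (images, labels) batches; only one batch is read into memory at a time."""
    with h5py.File(path, "r") as f:
        images, labels = f["images"], f["labels"]
        num_samples = images.shape[0]
        for start in range(0, num_samples, batch_size):
            stop = min(start + batch_size, num_samples)
            yield images[start:stop], labels[start:stop]


path = os.path.join(tempfile.mkdtemp(), "train.h5")
write_hdf5(path, num_samples=10, sample_shape=(6, 32, 32, 3), num_classes=5)
batch_shapes = [(xb.shape, yb.shape) for xb, yb in hdf5_batches(path, batch_size=4)]
```

A generator like hdf5_batches could then be wrapped with tf.data.Dataset.from_generator, or replaced by the tfio reader mentioned above, depending on what fits your setup.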

I also do not understand the shape of your datasets: I would expect (samples, height, width, channels), i.e. a 4-dimensional dataset, but somehow you have 5-dimensional ones.

Kind regards,