Dataset creation

Hello,

I’m new to TensorFlow and machine learning. I would like to create a new dataset to train a CNN. The objective of this project is not classification but prediction: from an image (9x101) created from sensor data, I would like to predict a matrix that contains 101 points. Most of the documentation is about creating a dataset for classification. Any help, guidance, or links on how to associate each image with its output matrix in a dataset would be appreciated.

Best regards,
Simon

@Ozan_Simon Welcome to the TensorFlow Forum!

Here’s a guide to creating a dataset for predicting a 101-point matrix from a 9x101 image using a CNN in TensorFlow:

  1. Data Collection and Organization: Ensure you have a collection of images (9x101) and their corresponding output matrices (101x1, assuming single-value points).
    Create a directory structure like:
    dataset/
      train/
        images/
          image1.jpg
          image2.jpg
        matrices/
          matrix1.npy
          matrix2.npy
      validation/
        … (similar structure)
      test/
        … (similar structure)

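  2. Data Loading and Preprocessing: Use tf.keras.preprocessing.image.load_img and tf.keras.preprocessing.image.img_to_array (also available as tf.keras.utils.load_img and tf.keras.utils.img_to_array in recent releases) to load images and convert them to NumPy arrays. Use numpy.load to load the .npy files containing the output matrices. Normalize both images and matrices to a range of 0 to 1 for better training stability, and keep the scaling factors used for the matrices so that predictions can be converted back to the original units.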

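  3. Dataset Creation: Create a dataset for training, validation, and testing, for example train_ds = tf.data.Dataset.from_tensor_slices((images_train, matrices_train)). Each element of the dataset is then an (image, matrix) pair, which is how an image is associated with its output. Apply shuffling and batching with the dataset's shuffle and batch methods.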

  4. Model Architecture: Design a CNN with convolutional and pooling layers followed by a flattening layer to extract features from the images. Add fully connected (dense) layers to process the flattened features, ending in a layer with 101 units that outputs the 101-point matrix. Use a linear activation function (e.g., tf.keras.activations.linear) for the output layer, as you’re predicting continuous values.

  5. Training: Define a loss function (e.g., mean squared error) and an optimizer (e.g., Adam). Fit the model on the training dataset, monitoring validation loss:
    model.fit(train_ds, epochs=10, validation_data=val_ds)

  6. Prediction: Load and preprocess a new image in the same way as the training images, add a leading batch dimension, and use the trained model’s predict method:
    predicted_matrix = model.predict(new_image)
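
Putting steps 3 to 6 together, a minimal sketch might look like the following. It assumes the images and matrices have already been loaded into NumPy arrays; random arrays stand in for them here, and the single image channel, layer sizes, batch size, and epoch count are arbitrary illustrative choices rather than recommendations.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-ins for the loaded data (assumption): 200 single-channel
# 9x101 "images" and one 101-value target per image, already scaled to [0, 1).
rng = np.random.default_rng(0)
images = rng.random((200, 9, 101, 1), dtype=np.float32)
matrices = rng.random((200, 101), dtype=np.float32)

# Simple train/validation split (the 160/40 split is arbitrary).
images_train, images_val = images[:160], images[160:]
matrices_train, matrices_val = matrices[:160], matrices[160:]

# Step 3: each dataset element is an (image, matrix) pair.
train_ds = (
    tf.data.Dataset.from_tensor_slices((images_train, matrices_train))
    .shuffle(buffer_size=len(images_train))
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
val_ds = tf.data.Dataset.from_tensor_slices((images_val, matrices_val)).batch(32)

# Step 4: a small CNN ending in a linear Dense layer with 101 outputs.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(9, 101, 1)),
    tf.keras.layers.Conv2D(16, (3, 3), padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(1, 2)),  # pool along the width only
    tf.keras.layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(101, activation="linear"),
])

# Step 5: mean squared error loss with the Adam optimizer.
model.compile(optimizer="adam", loss="mse")
history = model.fit(train_ds, epochs=2, validation_data=val_ds, verbose=0)

# Step 6: predict expects a batch, so keep the leading dimension of size 1.
new_image = images_val[:1]
predicted_matrix = model.predict(new_image, verbose=0)
print(predicted_matrix.shape)  # (1, 101)
```

If the targets are stored as 101x1 arrays, reshaping them to length-101 vectors (for example with reshape(-1, 101)) makes them match the output of the final Dense layer.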

Let us know if this helps!