How to debug CustomLoss in a CustomModel with a CustomSequence?

Hi, I’m lost, so I’ve put this MWE together. I need a CustomSequence because my data won’t fit in RAM.
I need a CustomLoss function because my input looks like (x, m), y,
and my loss function eventually needs to do something like (y_true - y_pred*m)^2 (a rough sketch of that is at the end of the MWE below).
Since my loss function has more than two inputs, I need a CustomModel (I think all of my reasons are correct).
Now I’m confused. When I set a breakpoint in my loss function, l1 and l2 have no values associated with them, and yet all of this code runs without error. It’s as if l1 and l2 are objects and I can’t get to their values. Moreover, many of my arguments have None for their shape.

Can someone help me understand what is happening?

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.utils import Sequence
import numpy as np


class CustomSequence(Sequence):
    def __init__(self, input_data_1, input_data_2, output_data, batch_size):
        self.input_data_1 = input_data_1
        self.input_data_2 = input_data_2
        self.output_data = output_data
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.input_data_1) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch_x1 = self.input_data_1[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_x2 = self.input_data_2[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.output_data[idx * self.batch_size:(idx + 1) * self.batch_size]

        return [batch_x1, batch_x2], batch_y


class CustomModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense_layer = keras.layers.Dense(64, activation='relu')
        self.output_layer = keras.layers.Dense(128)

    def call(self, inputs):
        x1, x2 = inputs
        x1 = self.dense_layer(x1)
        concatenated_input = keras.layers.concatenate([x1, x2])
        output = self.output_layer(concatenated_input)
        return output

    def custom_loss(self, y_true, y_pred, input_1):
        l1 = keras.losses.mean_squared_error(y_true, y_pred)
        l2 = keras.backend.mean(input_1)
        return l1+l2

    def train_step(self, data):
        x, y = data

        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            loss = self.custom_loss(y, y_pred, x[0])

        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}

input_data_1 = np.random.rand(5000, 128).astype('float32')
input_data_2 = np.random.rand(5000, 128).astype('float32')
output_data = np.random.rand(5000, 128).astype('float32')

batch_size = 32
custom_sequence = CustomSequence(input_data_1, input_data_2, output_data, batch_size)

model = CustomModel()
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(custom_sequence, epochs=100)
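
For reference, the eventual masked loss I have in mind would look roughly like this (just a sketch, not wired into the MWE above; masked_mse is only a placeholder name):

    def masked_mse(self, y_true, y_pred, m):
        # m is the mask from the (x, m), y input structure
        return keras.backend.mean(keras.backend.square(y_true - y_pred * m))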

Hi @Bob_Zigon ,

Here are my thoughts on the above problem.

  1. In your custom loss function, l1 is the mean squared error between y_true and y_pred, and l2 is the mean of the first input (input_1). You don’t see values for l1 and l2 at a breakpoint because model.fit() wraps train_step in a tf.function: the method is traced once into a TensorFlow graph, so inside it l1 and l2 are symbolic tensors that only receive concrete values when the graph is executed during training (see the first sketch after this list).
  2. Shapes and None values: for the same reason, the shapes you inspect while debugging are the static shapes of symbolic tensors, and the batch dimension is left as None because it is not fixed until a batch actually flows through the graph. Seeing None in shapes during model building and tracing is common and expected (see the second sketch after this list).
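
For point 1, here is a minimal sketch (against your MWE) of two ways to actually see those values. Note that run_eagerly slows training down, so it is for debugging only:

# (a) Run everything eagerly so a debugger breakpoint inside custom_loss
#     sees concrete EagerTensors (l1.numpy() will then give you numbers):
model.compile(optimizer='adam', loss='mse', metrics=['mae'], run_eagerly=True)

# (b) Or stay in graph mode and print values at graph-execution time
#     by adding a tf.print inside custom_loss:
def custom_loss(self, y_true, y_pred, input_1):
    l1 = keras.losses.mean_squared_error(y_true, y_pred)
    l2 = keras.backend.mean(input_1)
    tf.print("l1 mean:", tf.reduce_mean(l1), "l2:", l2)  # prints every batch
    return l1 + l2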
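
For point 2, you can see the difference between the static (traced) shape and the concrete per-batch shape by adding two lines at the top of train_step, right after x, y = data (again, just a sketch):

# The static shape is fixed at trace time and keeps None for the batch axis;
# tf.shape returns the real shape of the batch currently being processed.
print("static shape (trace time):", x[0].shape)        # typically (None, 128)
tf.print("dynamic shape (run time):", tf.shape(x[0]))  # e.g. [32 128]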

Please let me know if this helps.

Thanks