Odd behavior with custom loss functions when training

I'm trying to implement a weighted loss function and took two different approaches. Both yield the same loss value when I take my model's predictions and compute the loss against the ground truths by hand. However, when I pass each loss function to model.compile and then call model.fit, they produce vastly different results. Any idea why that could be?
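Both implementations are meant to compute the same weighted binary cross-entropy, i.e. roughly (pos_w / neg_w are the per-class weights, eps is a small constant for numerical stability):

    loss = -mean(pos_w * y_true * log(y_pred + eps)
                 + neg_w * (1 - y_true) * log(1 - y_pred + eps))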

First approach (plain Python functions):

    def get_weighted_loss(self, pos_weights, neg_weights, epsilon=1e-7):
        def weighted_loss(y_true, y_pred):
            # weighted binary cross entropy: positive term scaled by pos_weights,
            # negative term scaled by neg_weights
            loss = -1 * keras.src.backend.numpy.mean(
                pos_weights * y_true * keras.src.backend.numpy.log(y_pred + epsilon) +
                neg_weights * (1 - y_true) * keras.src.backend.numpy.log(1 - y_pred + epsilon))
            return loss
        return weighted_loss
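(Side note: keras.src.backend.numpy is an internal path. The same formula written against the public keras.ops API, assuming Keras 3, would look roughly like the sketch below; get_weighted_loss_ops is just a renamed, module-level variant for illustration.)

    import keras

    def get_weighted_loss_ops(pos_weights, neg_weights, epsilon=1e-7):
        def weighted_loss(y_true, y_pred):
            # weighted binary cross entropy via the documented keras.ops namespace
            return -keras.ops.mean(
                pos_weights * y_true * keras.ops.log(y_pred + epsilon)
                + neg_weights * (1 - y_true) * keras.ops.log(1 - y_pred + epsilon))
        return weighted_loss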

Implementation:

    def train_model(self, epochs=10):
        freq_pos, freq_neg = dsu.compute_class_frequency(self.y_train)
        # compile with the weighted loss: positive examples are weighted by the
        # negative-class frequency and vice versa
        self.model.compile(optimizer=keras.optimizers.AdamW(),
                           loss=self.get_weighted_loss(freq_neg, freq_pos),
                           metrics=['accuracy'])
        # fit on the training data, validating on the held-out set
        history = self.model.fit(self.x_train, self.y_train, validation_data=(self.x_valid, self.y_valid),
                                 epochs=epochs, batch_size=32)
        return history
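In case it's relevant: dsu.compute_class_frequency is a small helper of mine; for the purposes of this question, think of it as returning the fraction of positive and negative labels, roughly equivalent to the sketch below (so the positive term ends up weighted by the negative-class frequency and vice versa).

    import numpy as np

    def compute_class_frequency(labels):
        # rough sketch of the helper: fraction of positive vs. negative labels
        labels = np.asarray(labels, dtype=np.float32)
        freq_pos = float(labels.mean())
        freq_neg = 1.0 - freq_pos
        return freq_pos, freq_neg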

Result:
Accuracy reaches roughly 0.95-0.98 after about 3 training epochs.

Second approach (subclassing keras.losses.Loss):

    import keras

    class SingleClassWeightedLoss(keras.losses.Loss):
        def __init__(self, pos_weight, neg_weight, epsilon=1e-7):
            super(SingleClassWeightedLoss, self).__init__()
            self.name = 'WeightedLoss'
            self.neg_weight = neg_weight
            self.pos_weight = pos_weight
            self.epsilon = epsilon

        def call(self, y_true, y_pred):
            # same weighted binary cross entropy as the closure version
            loss = -1 * keras.src.backend.numpy.mean(
                self.pos_weight * y_true * keras.src.backend.numpy.log(y_pred + self.epsilon) +
                self.neg_weight * (1 - y_true) * keras.src.backend.numpy.log(1 - y_pred + self.epsilon))
            return loss
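A more idiomatic variant would forward the base-class kwargs (name, reduction, etc.) through super().__init__ instead of overwriting self.name afterwards. For completeness, a sketch of what I mean (same math, written with the public keras.ops API and assuming Keras 3; the V2 name is just for illustration):

    import keras

    class SingleClassWeightedLossV2(keras.losses.Loss):
        def __init__(self, pos_weight, neg_weight, epsilon=1e-7, name="WeightedLoss", **kwargs):
            # pass name (and optionally reduction) through to the Loss base class
            super().__init__(name=name, **kwargs)
            self.pos_weight = pos_weight
            self.neg_weight = neg_weight
            self.epsilon = epsilon

        def call(self, y_true, y_pred):
            # same weighted binary cross entropy as the closure version
            return -keras.ops.mean(
                self.pos_weight * y_true * keras.ops.log(y_pred + self.epsilon)
                + self.neg_weight * (1 - y_true) * keras.ops.log(1 - y_pred + self.epsilon))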

Implementation:

    def train_model(self, epochs=10):
        freq_pos, freq_neg = dsu.compute_class_frequency(self.y_train)
        # compile with the class-based weighted loss (same weight ordering as before)
        self.model.compile(optimizer=keras.optimizers.AdamW(),
                           loss=lsu.SingleClassWeightedLoss(freq_neg, freq_pos),
                           metrics=['accuracy'])
        # fit on the training data, validating on the held-out set
        history = self.model.fit(self.x_train, self.y_train, validation_data=(self.x_valid, self.y_valid),
                                 epochs=epochs, batch_size=32)
        return history

Result:
Accuracy stays around 0.40-0.45 even after 10 epochs.

Paradox (I compute the loss with both approaches outside of training):

    # standalone check: run the same predictions and labels through both losses
    my_preds = my_nn.batch_predict(my_nn.x_train, normalize=True)
    my_ground_truth = my_nn.y_train

    my_loss_fn = my_nn.get_weighted_loss(my_neg_freq, my_pos_freq)
    loss = my_loss_fn(my_ground_truth, my_preds)
    print(loss)

    class_loss_fn = lsu.SingleClassWeightedLoss(my_neg_freq, my_pos_freq)
    loss = class_loss_fn(my_ground_truth, my_preds)
    print(loss)

Both loss functions yield the same result: the two printed loss values are identical.
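The next thing I can think of checking is whether the gradients match as well, not just the loss values. A rough sketch of that check (assuming the TensorFlow backend, that my_nn.model is the underlying Keras model, and that it already outputs probabilities; otherwise I'd normalize the same way batch_predict does; the batch of 32 is arbitrary):

    import tensorflow as tf

    x_batch = my_nn.x_train[:32]
    y_batch = my_nn.y_train[:32]

    fn_loss = my_nn.get_weighted_loss(my_neg_freq, my_pos_freq)
    cls_loss = lsu.SingleClassWeightedLoss(my_neg_freq, my_pos_freq)

    for loss_fn in (fn_loss, cls_loss):
        # compute the loss on one batch and the gradient w.r.t. the model weights
        with tf.GradientTape() as tape:
            y_pred = my_nn.model(x_batch, training=True)
            loss_value = loss_fn(y_batch, y_pred)
        grads = tape.gradient(loss_value, my_nn.model.trainable_variables)
        grads = [g for g in grads if g is not None]
        print(loss_value, tf.linalg.global_norm(grads))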

Why does the first approach give me good results while the second approach gives terrible ones?