Categorical Crossentropy Label Smoothing


I was trying to figure out what the label_smoothing parameter does in the “Categorical Crossentropy” loss. Looking at the code, I came across this (keras/keras/losses/ at v3.1.1 · keras-team/keras · GitHub):

if label_smoothing:
    num_classes = ops.cast(ops.shape(y_true)[-1], y_pred.dtype)
    y_true = y_true * (1.0 - label_smoothing) + (
        label_smoothing / num_classes
    )

The calculation of num_classes assumes that the classes are on the last axis (-1), but the categorical_crossentropy function takes “axis” as a parameter that says which axis holds the classes.
I don’t understand why we don’t just use:
num_classes = ops.cast(ops.shape(y_true)[axis], y_pred.dtype)
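
To make the concern concrete, here is a minimal NumPy sketch of the difference. It is not the Keras code path: NumPy stands in for keras.ops, smooth_labels is a hypothetical helper that applies the same formula as the snippet above, and the shape (batch=2, classes=3, positions=5) with axis=1 is an example I made up.

```python
import numpy as np

def smooth_labels(y_true, label_smoothing, num_classes):
    # Hypothetical helper: same formula as the Keras snippet,
    # but num_classes is passed in so both variants can be compared.
    return y_true * (1.0 - label_smoothing) + label_smoothing / num_classes

# One-hot labels with the classes on axis 1 (assumed layout for this example).
axis = 1
class_ids = np.array([[0, 1, 2, 0, 1],
                      [2, 2, 1, 0, 0]])               # shape (2, 5)
y_true = np.moveaxis(np.eye(3)[class_ids], -1, axis)  # shape (2, 3, 5)

label_smoothing = 0.1

# Variant from the snippet: class count read from the last axis (5 here).
with_last_axis = smooth_labels(y_true, label_smoothing, y_true.shape[-1])
# Variant I am proposing: class count read from the class axis (3 here).
with_class_axis = smooth_labels(y_true, label_smoothing, y_true.shape[axis])

# A smoothed label distribution should still sum to 1 over the classes.
sums_last = with_last_axis.sum(axis=axis)
sums_class = with_class_axis.sum(axis=axis)
print(sums_last.min(), sums_last.max())
print(sums_class.min(), sums_class.max())
```

In this sketch, the smoothed labels only remain a valid distribution over the classes when num_classes is read from the class axis. When it is read from the last axis, each label vector sums to 0.9 + 3 * 0.1 / 5 = 0.96 instead of 1.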

Is there something I’ve misunderstood that explains this, or is it an error?