ValueError: No gradients provided for any input

I’m trying to run my model - the UDP check in the call function is irrelevant for now. The model takes 100 time steps with 2 features per sample - hence the (100, 2) input shape.

import tensorflow as tf
from tensorflow.keras import Model, layers

class KnownDetector(Model):
    def __init__(self):
        super(KnownDetector, self).__init__()
        self.TCP = tf.keras.Sequential([
            layers.Conv1D(filters=32, kernel_size=3, activation="relu", input_shape=(100, 2)),  # 100 packets, 2 features
            layers.MaxPool1D(pool_size=3, padding='same'),
            layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
            layers.MaxPool1D(pool_size=3, padding='same'),
            layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
            layers.Dense(128),
            tf.keras.layers.Softmax()
        ])

    def call(self, x):
        flag = True  # check if x is TCP or UDP in the future
        if flag:
            return self.TCP(x)

fx = KnownDetector()
fx.compile()
fx.fit(train, train_labels, epochs=1, validation_data=(test, test_labels), batch_size=32)

However, I keep getting the ValueError shown in the title.

I’ve looked online, and most of the solved issues don’t seem relevant to my case. Why does this happen, and how do I fix it?

In the .compile() call you did not specify an optimizer or a loss function. A loss is required to compute gradients.
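
For example, something along these lines (the 'adam' optimizer and the sparse loss here are placeholder choices, not prescribed by the thread - pick ones that match your labels):

fx.compile(optimizer='adam',
           loss=tf.keras.losses.SparseCategoricalCrossentropy())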

Thanks. I tried adding an optimizer and a loss function before posting this, but I received an ambiguous error instead.

fx.compile(loss=tf.keras.losses.CategoricalCrossentropy(), optimizer='sgd')

ValueError: Shapes (None, 1) and (None, 9, 128) are incompatible

I’m not sure where these shapes come from, as none of my data looks like this. However, the error is triggered by the fit function. Right before the fit call, everything is shaped as follows:

>>> print(test_labels.shape, test.shape, train_labels.shape, train.shape)
(4503,) (4503, 100, 2) (13508,) (13508, 100, 2)

This error says that it’s impossible to compare targets that have shape (None, 1) with a model output that has shape (None, 9, 128). None stands for the batch dimension; you can ignore it. All the rest is what you define and pass to the model.
In a classification model the final dense layer should have a number of units equal to the number of classes. I suspect that 128 is not the number of classes but an arbitrarily chosen number of units.
The output shape (None, 9, 128) means that you need to add Flatten before the final Dense layer. It will eliminate the 9, which is the length left over after the conv/pool stack (100 → 98 → 33 → 31 → 11 → 9).
When the targets are in a sparse format (not one-hot encoded, just a single column of integer labels), the loss should be SparseCategoricalCrossentropy.
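
A minimal sketch of a head rewritten along those lines (num_classes is a placeholder for however many classes your labels actually contain; the conv stack mirrors the code above):

num_classes = 5  # placeholder; set to the real number of classes

model = tf.keras.Sequential([
    layers.Conv1D(filters=32, kernel_size=3, activation="relu", input_shape=(100, 2)),
    layers.MaxPool1D(pool_size=3, padding='same'),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.MaxPool1D(pool_size=3, padding='same'),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.Flatten(),                                # (9, 32) -> (288,), removes the stray 9
    layers.Dense(num_classes, activation='softmax')  # one unit per class
])
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(), optimizer='sgd')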

Thank you!

Could you clarify the point about the final dense layer having a number of units equal to the number of classes? The paper I’m implementing isn’t clear - it has a 128-unit dense layer before the final softmax layer, and there aren’t 128 classes. In this basic model with my data there are only 5 classes, and the labels are numbered 1 through 5.

I’ve changed to SparseCategoricalCrossentropy and added a Flatten layer, like so:

class KnownDetector(Model):
    def __init__(self):
        super(KnownDetector, self).__init__()
        self.TCP = tf.keras.Sequential([
layers.Conv1D(filters=32, kernel_size=3, activation="relu", input_shape=(100, 2)),  # 100 packets, 2 features
            layers.MaxPool1D(pool_size=3, padding='same'),
            layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
            layers.MaxPool1D(pool_size=3, padding='same'),
            layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
            layers.Dense(128),
            layers.Flatten(),
            tf.keras.layers.Softmax()
        ])
    def call(self, x):
        flag = True  # check if x is TCP or UDP in the future
        if flag:
            return self.TCP(x)

fx = KnownDetector()
fx.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(), optimizer='adam', metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
fx.fit(train, train_labels, epochs=1, validation_data=(test, test_labels))

This actually trains the model with ~40% accuracy.

If I change the last dense layer to layers.Dense(5), the accuracy is in the single digits. This doesn’t make sense to me - if the final layer should have one unit per class, how does having 128 of them fit at all? I don’t have 128 labels. Could there be an issue with my call function?

Note: changing the Softmax layer to have 5 dimensions brings a completely different error.

Here is an example of a basic classification model: Basic classification: Classify images of clothing | TensorFlow Core.
Classes should be labeled starting from 0, because the model outputs class probabilities (or logits), and column indexes are used to identify the predicted class.
If you have 5 classes, the final dense layer should have 5 units. You can insert a Flatten layer after the last Conv1D, then Dense(128, activation='relu'), then Dense(5, activation='softmax').
If you do not specify 'softmax' in the final layer, you should define the loss as SparseCategoricalCrossentropy(from_logits=True).
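
Putting all of that together, a minimal sketch of the corrected model (the layer sizes follow the advice above; the label shift assumes the 1-through-5 labels mentioned earlier in the thread):

import tensorflow as tf
from tensorflow.keras import layers

num_classes = 5

# Labels run 1..5 in this thread; shift them to 0..4 so they match the output columns.
train_labels = train_labels - 1
test_labels = test_labels - 1

model = tf.keras.Sequential([
    layers.Conv1D(filters=32, kernel_size=3, activation="relu", input_shape=(100, 2)),
    layers.MaxPool1D(pool_size=3, padding='same'),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.MaxPool1D(pool_size=3, padding='same'),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.Flatten(),                       # flatten after the last Conv1D
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)               # outputs logits: no softmax, so use from_logits=True
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
model.fit(train, train_labels, epochs=1, validation_data=(test, test_labels))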