I’m training a 3D image classification network and would like to modify the output of the network before the loss function is calculated. However, it’s not clear to me how best to do this in TensorFlow, so I’m looking for advice.
The output of the network (a U-net) is a one-hot-encoded 3D volume. I would like to dilate the predicted segmentation by an amount that depends on the location of each voxel within the volume. The image transforms I need are available in numpy (and/or scipy), but it’s not clear to me how best to incorporate that functionality between the network output and the loss function. I’ve written a custom model class with a train_step method that calls the loss function, but there doesn’t seem to be a clear way to move between tensors and numpy arrays. Or is there a different approach I could be taking…?
I need to experiment with a few options, but the main task will be to expand the segmented region identified by the network (probably using skimage.draw, amongst other functions).
This is my most recent attempt:
import numpy as np

def modify_function(y_pred):
    # Convert from one-hot encoding to class labels
    y_pred_modified = np.argmax(y_pred, axis=-1)
    # Do some further processing in numpy / scipy / skimage...
    return y_pred_modified
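As a sanity check, the argmax / one-hot round trip itself behaves as expected on a toy volume (a sketch; the shapes here are illustrative, not my real data):

```python
import numpy as np
import tensorflow as tf

# Toy one-hot volume: batch 1, 2x2x2 voxels, 2 classes -> shape (1, 2, 2, 2, 2)
y_pred = tf.one_hot([[[[0, 1], [1, 0]], [[1, 1], [0, 0]]]], 2)

labels = np.argmax(y_pred.numpy(), axis=-1)  # back to class indices, shape (1, 2, 2, 2)
roundtrip = tf.one_hot(labels, 2)            # and back to one-hot
```

So the conversion itself is fine; the problem is only where it sits relative to the gradient tape.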
class TubeNetModel(keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass. Result is one-hot encoded with two categories (0 and 1)
            # Call the function to modify the output of the network
            # (np.argmax returns int64, so declare that as the output dtype)
            y_pred_modified = tf.numpy_function(modify_function, [y_pred], tf.int64)
            # Convert result back to one-hot encoding
            y_pred_modified = tf.one_hot(tf.cast(y_pred_modified, tf.int32), 2)
            # Reshape (necessary?)
            y_pred_modified = tf.reshape(y_pred_modified, tf.shape(y_pred))
            # Calculate the loss
            loss = self.compiled_loss(y, y_pred_modified, regularization_losses=self.losses)
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}
However, this produces the following error when the apply_gradients method is called:
ValueError: No gradients provided for any variable:
followed by a long list of model layer instances. I think this is due to the numpy function call failing and therefore no gradients being calculated, but I’m not even sure how to debug it! Any suggestions?
Since the function takes numpy arrays, you cannot take gradients through a numpy_function. If you require something that is differentiable, please consider using tf.py_function.
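To illustrate the difference: gradients can flow through tf.py_function as long as the wrapped function itself only uses differentiable TF ops on the (eager) tensors it receives. Here a hypothetical doubling transform stands in for the real processing:

```python
import tensorflow as tf

def eager_transform(y_pred):
    # Runs eagerly on EagerTensors; using only TF ops keeps it differentiable.
    # Hypothetical stand-in for the real post-processing.
    return y_pred * 2.0

x = tf.constant([[1.0, 2.0]])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.py_function(eager_transform, [x], tf.float32)
    loss = tf.reduce_sum(y)

grad = tape.gradient(loss, x)  # gradients flow through tf.py_function
```

Note this only helps if the processing can be expressed in TF ops; dropping to numpy/scipy inside the function breaks the gradient just the same, and argmax-style ops have no gradient regardless of the wrapper.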
If you just need a dilation, you could try adding a dilation layer:
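For example, a fixed-radius binary dilation can be expressed as a 3D max pooling with stride 1, which stays differentiable (a sketch; a location-dependent amount would need more work on top of this):

```python
import tensorflow as tf

def dilate3d(volume, kernel=3):
    # Grayscale dilation of a (batch, d, h, w, channels) volume: 3D max pooling
    # with stride 1 and SAME padding. On a {0, 1} mask this is binary dilation,
    # and gradients flow through it, unlike a numpy round trip.
    return tf.nn.max_pool3d(volume, ksize=kernel, strides=1, padding="SAME")

# Single foreground voxel in the centre of a 3x3x3 mask
mask = tf.scatter_nd([[0, 1, 1, 1, 0]], [1.0], shape=[1, 3, 3, 3, 1])
dilated = dilate3d(mask)  # the foreground grows to fill the 3x3x3 cube
```

Applied to the softmax probabilities (rather than an argmax’d mask), this keeps the whole path from network output to loss differentiable.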
We also have a subset of the NumPy API (with some limits) available at:
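(This presumably refers to tf.experimental.numpy, which mirrors part of the NumPy API while returning TF tensors, so the computation stays in the graph:)

```python
import tensorflow as tf
import tensorflow.experimental.numpy as tnp

scores = tf.constant([[0.1, 0.9], [0.8, 0.2]])
labels = tnp.argmax(scores, axis=-1)  # mirrors np.argmax but returns a TF tensor
```

Note that argmax itself still has no gradient, so this helps only where the NumPy-style ops being replaced are themselves differentiable.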