Code error using Gradient Tape

tzahi_geller · July 4, 2022, 2:54am

Hi all,
I tried to implement a very basic classification algorithm using tensorflow API
the steps are:

creating synthetic data
define the architecture prediction = tf.matmul(inpurs,W) + b
iterate on training step

For some reason the GradientTape instance could not find W,b so I used local function variables
the code is:

import tensorflow as tf
input_dims=2
output_dims=1

W = tf.Variable(initial_value = tf.random.uniform((input_dims,output_dims)))
b = tf.Variable(initial_value = tf.random.uniform((output_dims,)))

def square_loss(preds,labels):
loss = tf.square(preds-labels)

return tf.reduce_mean(loss)

def training_step(data,labels,local_W,local_b):
with tf.GradientTape() as g:
predictions = tf.matmul(data,local_W)+local_b
loss = square_loss(predictions,labels)

grad_W,grad_b = g.gradient(loss, [local_W,local_b])
return loss,grad_W,grad_b

step_size = 0.02
loss_val = []
for i in range(40):
#print(W)
(loss,grad_W,grad_b) = training_step(data,targets,W,b)
loss_val.append(np.array(loss))
#print(grad_W)
W = tf.add(W,grad_W*(-step_size))
b = tf.add(b,grad_b*(-step_size))

What happens is at iteration 2 the gradients calculated by GradientTape instance are none type
I checked the loss function (on which gradient is calculated) and it has a valid number
Can someone please explain why the gradient calculation is not working?

Mark_Strefford · July 11, 2022, 9:51am

Can you structure your code better. It’s partially marked with code and mostly not, there’s no indents, etc.

Also, this is an example of how I’ve used gradientTape successfully:

def step(real_x, real_y):
    with tf.GradientTape() as tape:
        # Make prediction
        pred_y = model(real_x.reshape((-1, 28, 28, 1)))
        # Calculate loss
        model_loss = tf.keras.losses.categorical_crossentropy(real_y, pred_y)
    
    # Calculate gradients
    model_gradients = tape.gradient(model_loss, model.trainable_variables)
    # Update model
    optimizer.apply_gradients(zip(model_gradients, model.trainable_variables))

This is from section 2.2 of tf.GradientTape Explained for Keras Users | by Sebastian Theiler | Analytics Vidhya | Medium

I didn’t see an optimiser.apply_gradients() call above, you seem to be trying to apply them manually.

tzahi_geller · July 13, 2022, 7:51am

So instead of using the optimizer I am just calculating the gradient and tweak the weights with a fixed learning rate
I saw this done in a book called Deep learning with python, it implements model fit basics without usage of keras (actually the writer was one of keras developers).
I will try to load my code better

Thanks!