Why does state stop gradients in tf.GradientTape?

import tensorflow as tf

x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

with tf.GradientTape(persistent=True) as tape:
  # Update x1 = x1 + x0.
  x1.assign_add(x0)
  # The tape starts recording from x1.
  y = x1**2   # y = (x1 + x0)**2

# This doesn't work: it prints None.
print(tape.gradient(y, x0))   # expected dy/dx0 = 2*(x1 + x0)

The code above prints None. Why is it designed this way?
Thanks.

@yangchenhao,

Shouldn’t it be print(tape.gradient(y, x1)), since you are adding x0 to x1 inside the gradient tape context?
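
Here is a minimal sketch of the difference, assuming TensorFlow 2.x with eager execution. The names a0, a1, total, z, and tape2 are hypothetical and only there for illustration. As far as I understand, assign_add changes the variable's state in place and the tape does not differentiate through that update, so y is connected to x1 but not to x0. Building a new tensor with ordinary arithmetic keeps the link to x0.

```python
import tensorflow as tf

# Stateful update, as in the question.
x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)
with tf.GradientTape(persistent=True) as tape:
    x1.assign_add(x0)   # x1 now holds 3.0; the update itself is not differentiated
    y = x1**2           # the tape only sees y as a function of x1

grad_x0 = tape.gradient(y, x0)   # None: no recorded path from y back to x0
grad_x1 = tape.gradient(y, x1)   # 2 * x1 = 6.0
print(grad_x0, grad_x1)

# Stateless alternative (hypothetical names): compute a new tensor instead.
a0 = tf.Variable(3.0)
a1 = tf.Variable(0.0)
with tf.GradientTape() as tape2:
    total = a1 + a0     # ordinary op, recorded by the tape
    z = total**2

grad_a0 = tape2.gradient(z, a0)  # 2 * (a1 + a0) = 6.0
print(grad_a0)
```

If you need the gradient with respect to x0, the stateless form is probably the one to use; if you only need dy/dx1, tape.gradient(y, x1) on your original code should already work.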

Thank you!