Best way to compute Hessian-vector product?

I am trying to compute a Hessian-vector product. The documentation on Advanced automatic differentiation (Advanced automatic differentiation | TensorFlow Core) links to a benchmark script for it.

However, that script has half a dozen different ways of implementing it! Which one is the most efficient? I tried to run the benchmark to get those numbers myself, but I couldn't figure out how.

Hi @Cheerful_Squirrel

You can use tf.GradientTape.jacobian for the Hessian-vector product. Please refer to this Hessian matrix example using tf.GradientTape.jacobian:


import tensorflow as tf

x = tf.random.normal([7, 5])
layer1 = tf.keras.layers.Dense(8, activation=tf.nn.relu)
layer2 = tf.keras.layers.Dense(6, activation=tf.nn.relu)

with tf.GradientTape() as t2:
  with tf.GradientTape() as t1:
    x = layer1(x)
    x = layer2(x)
    loss = tf.reduce_mean(x**2)
  # First-order gradient of the loss w.r.t. layer1's weights.
  g = t1.gradient(loss, layer1.kernel)
# Jacobian of that gradient, i.e. the full Hessian.
h = t2.jacobian(g, layer1.kernel)

print(f'layer1.kernel.shape: {layer1.kernel.shape}')
print(f'h.shape: {h.shape}')

Output:

layer1.kernel.shape: (5, 8)
h.shape: (5, 8, 5, 8)
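
Note that this materializes the full (5, 8, 5, 8) Hessian, which gets expensive for large models. For an actual Hessian-vector product you can avoid building the Hessian at all. Below is a minimal sketch of the two standard tricks on a toy scalar loss (the function f, the variable x, and the direction vector v are illustrative placeholders, not from the docs): reverse-over-reverse with nested tapes, and forward-over-reverse with tf.autodiff.ForwardAccumulator. The forward-over-reverse version skips the second backward pass, which is usually the more memory-friendly choice for large models.

import tensorflow as tf

# Toy scalar loss, a variable to differentiate, and a direction vector v.
x = tf.Variable(tf.random.normal([5, 8]))
v = tf.random.normal([5, 8])

def f(x):
  return tf.reduce_mean(x**2)

# 1. Reverse-over-reverse: differentiate the scalar (grad(f) . v) again.
with tf.GradientTape() as outer:
  with tf.GradientTape() as inner:
    loss = f(x)
  grads = inner.gradient(loss, x)
  grads_dot_v = tf.reduce_sum(grads * v)
hvp_reverse = outer.gradient(grads_dot_v, x)

# 2. Forward-over-reverse: push v forward through the gradient computation.
with tf.autodiff.ForwardAccumulator(primals=x, tangents=v) as acc:
  with tf.GradientTape() as tape:
    loss = f(x)
  grads = tape.gradient(loss, x)
hvp_forward = acc.jvp(grads)

# Both compute H @ v without ever forming the full Hessian.
print(tf.reduce_max(tf.abs(hvp_reverse - hvp_forward)).numpy())  # ~0.0

If you only need Hessian-vector products (e.g. for conjugate-gradient or curvature estimates), one of these two patterns is generally what you want rather than t2.jacobian over the whole gradient.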