How to convert a python list of tf.Tensors (of variable length) to a tf.Tensor of those tensors

Hi,
In my code I am calling the Adam optimizer as follows:

self.dqn_architecture.optimizer.apply_gradients(zip(dqn_architecture_grads, trainable_vars))

But I noticed the following showing up in my logs:

2023-02-17 20:05:44,776 5 out of the last 5 calls to <function _BaseOptimizer._update_step_xla at 0x7f55421ab6d0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has reduce_retracing=True option that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
2023-02-17 20:05:44,822 6 out of the last 6 calls to <function _BaseOptimizer._update_step_xla at 0x7f55421ab6d0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has reduce_retracing=True option that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.

On further investigation, I found that I am passing Python lists of tensors to the optimizer rather than tensors, i.e. cause (3) from the warning above.

I’ve also noticed what seems to be a memory leak: my RAM usage keeps growing the longer I train the model. This makes sense, because I read the following on Stack Overflow:

'Passing python scalars or lists as arguments to tf.function will always build a new graph. To avoid this, pass numeric arguments as Tensors whenever possible'
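The quoted behaviour is easy to see with a minimal sketch (the `square` function and the counter here are illustrative, not from the original code): calling a `tf.function` with plain Python values retraces for every distinct value, while tensors of the same dtype and shape reuse one trace.

```python
import tensorflow as tf

trace_count = 0

@tf.function
def square(x):
    global trace_count
    trace_count += 1  # this Python side effect runs only while tracing
    return x * x

# Python scalars: every distinct value builds a new concrete function.
for i in range(3):
    square(i)
scalar_traces = trace_count

trace_count = 0
# Tensors of the same dtype/shape: the first trace is reused.
for i in range(3):
    square(tf.constant(i))
tensor_traces = trace_count

print(scalar_traces, tensor_traces)  # 3 traces vs. 1 trace
```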

So I believe the solution is to pass a tensor of these tensors instead of a list. But when I try to convert the list using tf.convert_to_tensor(), I get the error:

'Shapes of all inputs must match: values[0].shape = [8,8,4,32] != values[1].shape = [32] [Op:Pack] name: packed'

because the tensors have varying dimensionality.

I’ve also tried using tf.ragged.constant, but get:
ValueError: all scalar values must have the same nesting depth

Any help would be appreciated. Really need to get this sorted. :slight_smile:


@Bryan_Carty,

It seems the issue arises because the tensors in your list do not all have the same shape.

In this case, you can try padding the tensors in the list to a common maximum size (for 2-D sequence data, tf.keras.utils.pad_sequences can do this) and then stacking the padded tensors with tf.stack to create a single tensor that can be passed on.
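As a minimal sketch of the pad-then-stack idea: since pad_sequences expects Python sequences rather than arbitrary-rank tensors, this example (the shapes and variable names are illustrative) flattens each tensor, pads with tf.pad to the longest flattened length, and stacks the results. Note you would need to keep the original shapes around to unflatten later.

```python
import tensorflow as tf

# Hypothetical gradients of different shapes (a conv kernel and a bias).
grads = [tf.ones([8, 8, 4, 32]), tf.ones([32])]

# Flatten each gradient to rank 1.
flat = [tf.reshape(g, [-1]) for g in grads]

# Pad every flattened gradient with zeros up to the longest length.
max_len = max(int(f.shape[0]) for f in flat)
padded = [tf.pad(f, [[0, max_len - int(f.shape[0])]]) for f in flat]

# Now all elements share one shape, so stacking succeeds.
stacked = tf.stack(padded)
print(stacked.shape)  # (2, 8192) since 8*8*4*32 = 8192
```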

Please refer to the working example here.

Thank you!