@tf.function during inference?

Should @tf.function only be used during training, or should it also be used during inference?

The function I use during inference calls the model, passes the output through a softmax, and then calls argmax. Adding @tf.function to this function works well, but does it provide any benefits?
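For reference, a minimal sketch of the kind of inference function described above (the model here is a placeholder; any Keras model works the same way):

```python
import tensorflow as tf

# Placeholder model for illustration only.
model = tf.keras.Sequential([tf.keras.layers.Dense(3)])

@tf.function
def predict(x):
    logits = model(x, training=False)   # forward pass
    probs = tf.nn.softmax(logits)       # logits -> probabilities
    return tf.argmax(probs, axis=-1)    # predicted class per example

# Usage: a batch of 2 examples with 4 features each.
preds = predict(tf.zeros([2, 4]))
```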

@tf.function provides some nice performance benefits, and I can’t think of any particular reason not to use it at inference, unless your input shapes change often or your inference logic otherwise forces frequent retracing.
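On the changing-shapes point: you can sidestep most retracing by giving tf.function a relaxed input_signature, so one trace covers any batch size. A sketch (model and feature size are assumptions for illustration):

```python
import tensorflow as tf

# Placeholder model for illustration only.
model = tf.keras.Sequential([tf.keras.layers.Dense(3)])

# A signature with a None batch dimension traces once and reuses
# that concrete function for any batch size, avoiding retraces.
@tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
def predict(x):
    return tf.argmax(tf.nn.softmax(model(x, training=False)), axis=-1)

small = predict(tf.zeros([1, 4]))  # traces here
large = predict(tf.zeros([8, 4]))  # reuses the same trace
```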

Using tf.function is also effectively required to export models to the SavedModel format, so I think it makes sense: Better performance with tf.function | TensorFlow Core