Behavior of `tf.GradientTape` vs `torch.autograd`

I am trying to translate a PyTorch implementation to TensorFlow and ran into some gradient-level issues along the way. I have already asked about it here with reproducible code. A minimal sketch of the kind of gradient computation involved is below.
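To make the comparison concrete, here is a tiny illustration (not my actual model, just an assumed toy function `y = x**2`) of how the two frameworks expose gradients. The difference I keep tripping over is that `torch.autograd` records operations globally on any tensor with `requires_grad=True`, while TensorFlow only records inside a `tf.GradientTape` context and only watches trainable `tf.Variable`s by default:

```python
# Toy example: compute d(x^2)/dx = 2x at x = 3.0 in both frameworks.
import tensorflow as tf
import torch

# PyTorch: mark the tensor with requires_grad=True; autograd tracks ops globally.
x_pt = torch.tensor(3.0, requires_grad=True)
y_pt = x_pt ** 2
y_pt.backward()                   # populates x_pt.grad
print(x_pt.grad)                  # tensor(6.)

# TensorFlow: gradients are only recorded inside the tape context,
# and only trainable tf.Variables are watched automatically.
x_tf = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y_tf = x_tf ** 2
print(tape.gradient(y_tf, x_tf))  # tf.Tensor(6.0, shape=(), dtype=float32)
```

In this toy case both give the same result; my issue shows up in the more involved setup linked above.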

Any suggestions or feedback would be highly appreciated. Thank you.