How to MatMul two tensors inside a custom C++ op?

I’m trying to port Deformable Conv V2 to TensorFlow and have run into trouble: I can't find any docs on how to compute an ordinary matmul of two tensors inside C++.

Can someone help with an example?

P.S.
That matmul involves a lot of computation, so I expect it should be something like a CPU/GPU functor template.
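
To make the idea concrete, here is a minimal, dependency-free sketch of the device-templated functor layout. It is not TensorFlow code: `CpuDevice`, `MatMulFunctor`, and the `MatMul` helper are hypothetical names chosen for illustration, and the triple loop is a naive stand-in for whatever optimized routine a real kernel would call. In TensorFlow's own custom-op examples the device tags are `Eigen::ThreadPoolDevice` and `Eigen::GpuDevice`, and a GPU specialization would live in a separate `.cu.cc` file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device tag. A GPU tag would get its own specialization below.
struct CpuDevice {};

// Primary template: declared once, specialized per device type.
template <typename Device, typename T>
struct MatMulFunctor;

// CPU specialization: naive row-major C[m x n] = A[m x k] * B[k x n].
template <typename T>
struct MatMulFunctor<CpuDevice, T> {
  void operator()(const CpuDevice&, const T* a, const T* b, T* c,
                  std::size_t m, std::size_t k, std::size_t n) const {
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        T acc = T(0);
        for (std::size_t p = 0; p < k; ++p) {
          acc += a[i * k + p] * b[p * n + j];
        }
        c[i * n + j] = acc;
      }
    }
  }
};

// Hypothetical helper standing in for the body of an op's Compute():
// it checks the buffer sizes, allocates the output, and dispatches on Device.
template <typename Device, typename T>
std::vector<T> MatMul(const Device& d, const std::vector<T>& a,
                      const std::vector<T>& b, std::size_t m, std::size_t k,
                      std::size_t n) {
  assert(a.size() == m * k && b.size() == k * n);
  std::vector<T> c(m * n);
  MatMulFunctor<Device, T>()(d, a.data(), b.data(), c.data(), m, k, n);
  return c;
}
```

Calling `MatMul(CpuDevice{}, a, b, m, k, n)` selects the CPU specialization at compile time. Adding a GPU path means adding another specialization of `MatMulFunctor` for a GPU tag, while the calling code stays the same.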

Take a look at:

More generally, if you are interested in custom ops: