How to MatMul two tensors inside a custom C++ op?

I’m trying to port Deformable Conv V2 to TensorFlow and ran into a problem: I can’t find any docs on how to compute an ordinary matmul of two tensors inside C++.

Can someone help with an example?

P.S.
That matmul involves a lot of computation, so I expect it should be something like a CPU/GPU functor template.

Take a look at:

More generally, if you are interested in custom ops:

https://tensorflow-prod.ospodiscourse.com/t/deformable-convolution-and-other-custom-ops/1951?u=bhack
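
For the CPU/GPU functor template mentioned in the question, here is a minimal sketch of the pattern in plain C++, with no TensorFlow headers. Everything in it is an assumption for illustration: `CpuDevice`, `MatMulFunctor` and `MatMulCpu` are hypothetical names, and the loop is a naive row-major matmul, not what TensorFlow's own MatMul kernel does. In a real op kernel the device tag would typically be `Eigen::ThreadPoolDevice` or `Eigen::GpuDevice`, and the raw pointers would come from the input and output tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device tag used only in this sketch. In a TensorFlow kernel
// this role is usually played by Eigen::ThreadPoolDevice (CPU) or
// Eigen::GpuDevice (GPU).
struct CpuDevice {};

// Primary template: declare once, then specialize per device.
template <typename Device, typename T>
struct MatMulFunctor {
  void operator()(const Device& d, std::size_t m, std::size_t k, std::size_t n,
                  const T* a, const T* b, T* out) const;
};

// CPU specialization: naive row-major product out[m x n] = a[m x k] * b[k x n].
template <typename T>
struct MatMulFunctor<CpuDevice, T> {
  void operator()(const CpuDevice&, std::size_t m, std::size_t k, std::size_t n,
                  const T* a, const T* b, T* out) const {
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        T acc = T(0);
        for (std::size_t p = 0; p < k; ++p) {
          acc += a[i * k + p] * b[p * n + j];
        }
        out[i * n + j] = acc;
      }
    }
  }
};

// Hypothetical convenience wrapper so the functor can be tried on std::vector.
template <typename T>
std::vector<T> MatMulCpu(const std::vector<T>& a, const std::vector<T>& b,
                         std::size_t m, std::size_t k, std::size_t n) {
  assert(a.size() == m * k && b.size() == k * n);
  std::vector<T> out(m * n);
  MatMulFunctor<CpuDevice, T>()(CpuDevice{}, m, k, n,
                                a.data(), b.data(), out.data());
  return out;
}
```

A GPU version would add a second specialization for a GPU device tag in a `.cu` file and launch a kernel there, while the op's `Compute` method stays templated on the device and just calls the functor.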