Call model inference in C/C++ with inputs already allocated in GPU memory

Dear all,

I’d like to use a TF model in scientific simulation code written in C++. This code can run the simulation on the GPU, so all the necessary input data may already be resident in GPU memory.

In order to call the TF model, I’m planning to use TF_NewTensor, roughly as in the sketch below.
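
For context, here is a minimal sketch of the host-memory case I have in mind, using TF_NewTensor from the TensorFlow C API; the buffer name and shape are just placeholders from my simulation code:

```c
#include <stdint.h>
#include <stdlib.h>
#include "tensorflow/c/c_api.h"

// No-op deallocator: the buffer is owned by the simulation, not by TF.
static void noop_dealloc(void* data, size_t len, void* arg) {
  (void)data; (void)len; (void)arg;
}

// Wrap an existing simulation buffer as a TF_Tensor without copying.
TF_Tensor* wrap_input(float* sim_buffer, int64_t n) {
  int64_t dims[2] = {1, n};
  // This works when sim_buffer is ordinary host memory. The question is
  // whether the same call is valid when sim_buffer is a device pointer
  // (e.g. from cudaMalloc), or whether TF always treats the data as host memory.
  return TF_NewTensor(TF_FLOAT, dims, 2,
                      sim_buffer, n * sizeof(float),
                      noop_dealloc, NULL);
}
```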

Now my question is: is it possible to control where a TF_Tensor is placed? Can I just wrap it around an existing on-GPU array to avoid a CPU-to-GPU memory transfer?

Thank you very much in advance!