Use of JVM TensorFlow Tensors and NdArrays

Hi, we have a Scala application where we are using the TensorFlow Core platform library: we load TensorFlow graphs with SavedModelBundle and run some operations on them.

I know that any Tensor or TType is a resource that has to be closed, and we are handling those in Scala with the try-with-resources paradigm, via a functional library and also scala.util.Using. The part I am confused about is NdArrays: if I use an API like StdArrays.ndCopyOf to copy, say, from a TFloat32 to a FloatNdArray, is that FloatNdArray allocated in JVM memory and hence garbage collected? I know tensors normally have to be closed to release the underlying C-based tensor memory. Please advise.
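For context, this is roughly the pattern we use for scoping tensors today (a simplified sketch, not our actual model code):

```scala
import org.tensorflow.types.TFloat32
import scala.util.Using

// Tensor extends AutoCloseable, so scala.util.Using releases the
// native (C-allocated) tensor memory when the block exits.
val result: scala.util.Try[Float] = Using(TFloat32.scalarOf(42.0f)) { tensor =>
  tensor.getFloat() // read the value back while the tensor is still alive
}
```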
thanks
Preet

Hi Preet,

Yes, NdArrays other than TF tensors are allocated in JVM memory or are backed by direct buffers that are tracked and released by the garbage collector. Hence, you are not required to protect them with try-with-resources blocks.

On the other hand, if you can provide an example of how you are copying your arrays, that could help, because StdArrays.ndCopyOf is normally used to create an NdArray copy of standard Java arrays, not of tensors like a TFloat32 (NdArray.copyTo should be used for that). Is this what you meant initially? A small sketch of the difference follows.

Karl
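To make the distinction concrete, here is a minimal sketch (shapes and values are made up): the NdArray below lives on the JVM heap and needs no cleanup, while the tensor wraps native memory and must still be closed.

```scala
import org.tensorflow.ndarray.{FloatNdArray, NdArrays, StdArrays}
import org.tensorflow.types.TFloat32
import scala.util.Using

// Heap-backed NdArray copied from a standard Java array: GC-managed, no close().
val ndArray: FloatNdArray = StdArrays.ndCopyOf(Array(Array(1.0f, 2.0f), Array(3.0f, 4.0f)))

// The tensor wraps native memory, so it must still be released.
Using(TFloat32.tensorOf(ndArray)) { tensor =>
  // Tensor -> NdArray: copy the native data back into a GC-managed NdArray.
  val copy = NdArrays.ofFloats(tensor.shape())
  tensor.copyTo(copy)
}
```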

Hi Karl, yes, I am using StdArrays.ndCopyOf to copy a Java array to an NdArray; I am not using the Tensor/TNumber API for that. However, I am curious: in some cases I have to take, say, a Scala Array[Array[Float]] and convert it to a TFloat32 tensor, and I always end up doing it in two steps, like this (code sketch below):
a) convert the primitive array to an NdArray using StdArrays.ndCopyOf
b) then use the overloaded TFloat32.tensorOf API to copy the NdArray into a tensor

Is there a direct way to create a tensor from Java primitive arrays?
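To be concrete, the two steps look like this in (simplified) code:

```scala
import org.tensorflow.ndarray.StdArrays
import org.tensorflow.types.TFloat32

val data: Array[Array[Float]] = Array(Array(1.0f, 2.0f), Array(3.0f, 4.0f))

// a) primitive array -> NdArray (heap-backed copy)
val ndArray = StdArrays.ndCopyOf(data)
// b) NdArray -> tensor (a second copy; the tensor must later be closed)
val tensor = TFloat32.tensorOf(ndArray)
```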

FYI, a Scala Array is the same thing as a Java array, so there is no extra hop there.

I am also curious how much of a performance hit an application takes copying 2-D or 3-D arrays back and forth to NdArrays: is it usually just a deep copy of the data, or do NdArrays use techniques to avoid taking that hit?

Is there a direct way to create a tensor from Java primitive arrays?

I have to admit that it was a deliberate choice not to add such an endpoint. The reason is that manipulating [2,n]-dimensional Java arrays is inherently slow, as they do a lot of dereferencing, and if performance matters they should be avoided. The idea is to store the data directly in an NdArray instead (the java-ndarray library does not depend on TensorFlow and can be used in any part of the project).
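For example, keeping a 2-D dataset in an NdArray from the start could look like this (a small sketch with an arbitrary shape):

```scala
import org.tensorflow.ndarray.{FloatNdArray, NdArrays, Shape}

// Contiguous, flat storage instead of nested Java arrays: no per-row
// dereferencing, and it is GC-managed, so no close() is needed.
val matrix: FloatNdArray = NdArrays.ofFloats(Shape.of(2, 3))
matrix.setFloat(1.0f, 0, 0) // row 0, column 0
matrix.setFloat(6.0f, 1, 2) // row 1, column 2
val value = matrix.getFloat(1, 2)
```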

Now the good news: if you want to keep using Scala multidimensional arrays and avoid the extra copy, you can write this logic yourself by replicating the code of StdArrays.ndCopyOf and replacing the NdArrays.ofFloats(shape) call with TFloat32.tensorOf(shape). All the methods involved are publicly accessible; see the sketch below.
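Something along these lines, for the 2-D float case (an untested sketch; tensorOf2d is just a name I picked, and it relies on the StdArrays.copyTo overload for float[][]):

```scala
import org.tensorflow.ndarray.{Shape, StdArrays}
import org.tensorflow.types.TFloat32

// Allocate the tensor up front, then copy the Java array straight into
// its native memory, skipping the intermediate NdArray allocation.
// The caller owns the returned tensor and must close it.
def tensorOf2d(data: Array[Array[Float]]): TFloat32 = {
  val tensor = TFloat32.tensorOf(Shape.of(data.length, data.head.length))
  StdArrays.copyTo(data, tensor)
  tensor
}
```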

And to add to that: while your current process (first creating an NdArray from a standard array, then copying it to a tensor) roughly doubles the space used (O(2n)), the extra time cost is small, as the content of the NdArray is basically copied to the tensor memory with a single memcpy (unless you use some fancy data type that needs conversion, but with floats you are fine).

Thanks Karl, this helps a lot. I will also see if we can avoid Java multi-dimensional arrays and just use NdArrays wherever possible.
