Segfault on allocate_temp

Hi, first question here, sorry if wrong format/category:

I’m writing a custom op (using the custom-op Docker images mentioned in the tutorial/repo) that tries to allocate a temporary tensor during the op but hits a segfault, which I’m having a hard time understanding. The snippet where this happens is inside the Compute method of an OpKernel, where I’ve got

Tensor *q_t;
const TensorShape &sh = TensorShape({sdr->sht->nlm});
OP_REQUIRES_OK(context, context->allocate_temp(DT_COMPLEX128, sh, q_t));

And that allocation results in a segfault:

Thread 1 "spharde_ops_tes" received signal SIGSEGV, Segmentation fault.
0x00007fe3c216b973 in tensorflow::OpKernelContext::allocate_tensor(tensorflow::DataType, tensorflow::TensorShape const&, tensorflow::Tensor*, tensorflow::AllocatorAttributes, tensorflow::AllocationAttributes const&) ()
   from /usr/local/lib/python3.6/dist-packages/tensorflow/python/../libtensorflow_framework.so.2

with a backtrace like so

#0  0x00007fe3c216b973 in tensorflow::OpKernelContext::allocate_tensor(tensorflow::DataType, tensorflow::TensorShape const&, tensorflow::Tensor*, tensorflow::AllocatorAttributes, tensorflow::AllocationAttributes const&) ()
   from /usr/local/lib/python3.6/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#1  0x00007fe3c216e696 in tensorflow::OpKernelContext::allocate_temp(tensorflow::DataType, tensorflow::TensorShape const&, tensorflow::Tensor*, tensorflow::AllocatorAttributes, tensorflow::AllocationAttributes const&) ()
   from /usr/local/lib/python3.6/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#2  0x00007fe3a5895190 in tensorflow::OpKernelContext::allocate_temp (this=0x7ffe37e7d710, type=tensorflow::DT_COMPLEX128, shape=..., out_temp=0x5727690, allocator_attr=...)
    at /usr/local/lib/python3.6/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:1025
#3  0x00007fe3a5895217 in tensorflow::OpKernelContext::allocate_temp (this=0x7ffe37e7d710, type=tensorflow::DT_COMPLEX128, shape=..., out_temp=0x5727690) at /usr/local/lib/python3.6/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:1029
#4  0x00007fe3a5899ae2 in ApplyShtDiffOp<Eigen::ThreadPoolDevice>::Compute (this=0x574e380, context=0x7ffe37e7d710) at tf_spharde/cc/kernels/shtdiff_kernels.cc:101

I could understand an out-of-memory or whatever but why the segfault? Any tips would be greatly appreciated.

I seem to have fixed it by using

Tensor q_t;
const TensorShape &sh = TensorShape({sdr->sht->nlm});
OP_REQUIRES_OK(context, context->allocate_temp(DT_COMPLEX128, sh, &q_t));

but it’s not clear to me why that works. Now when running the op, I get an abort instead: double free or corruption.

From reading use of allocate_temp in the TF codebase, this seems like it should work.

Update: the abort was just the rest of the kernel misbehaving. I am still curious about the segfault on alloc…

Oops, this is a dumb question: the first snippet just declares a pointer, so there’s no Tensor for allocate_temp to initialize. Other routines return a usable pointer through a Tensor ** argument, but allocate_temp takes just a Tensor *, which needs to point to a usable Tensor instance (even if its buffer is not yet allocated). That’s why declaring a Tensor on the stack and then passing its address works:

Tensor x;
context->allocate_temp(..., &x);
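For anyone who finds this later, here is a minimal sketch of the two out-parameter conventions, written without TensorFlow so it compiles on its own. FakeTensor, FakeContext, AllocateTemp, AllocateOutput and DemoBothConventions are all hypothetical stand-ins I made up for illustration; they only mimic the pointer shapes of allocate_temp (Tensor *) and allocate_output (Tensor **), not the real implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for tensorflow::Tensor (not the real class).
struct FakeTensor {
  std::vector<double> data;
};

// Hypothetical stand-in for an OpKernelContext-like object.
class FakeContext {
 public:
  // allocate_temp-style: the caller owns the tensor object and we assign
  // into it, so `out` must already point at a live FakeTensor.
  bool AllocateTemp(std::size_t n, FakeTensor* out) {
    // Catches a null pointer in debug builds; it cannot catch an
    // uninitialized pointer, which is what the original snippet passed.
    assert(out != nullptr);
    FakeTensor new_temp{std::vector<double>(n, 0.0)};
    *out = new_temp;
    return true;
  }

  // allocate_output-style: the context owns the tensor object and hands
  // back a pointer to it, so the caller only needs a pointer variable.
  bool AllocateOutput(std::size_t n, FakeTensor** out) {
    assert(out != nullptr);
    outputs_.push_back(std::make_unique<FakeTensor>());
    outputs_.back()->data.assign(n, 0.0);
    *out = outputs_.back().get();
    return true;
  }

 private:
  std::vector<std::unique_ptr<FakeTensor>> outputs_;
};

// Uses both conventions correctly and returns the total element count.
std::size_t DemoBothConventions(std::size_t n) {
  FakeContext ctx;

  FakeTensor temp;                 // a real object on the stack
  ctx.AllocateTemp(n, &temp);      // pass its address

  FakeTensor* output = nullptr;    // just a pointer; the context fills it in
  ctx.AllocateOutput(n, &output);  // pass the address of the pointer

  return temp.data.size() + output->data.size();
}
```

Calling DemoBothConventions(4) from a main function exercises both paths. The point is only that a Tensor * out-parameter needs an existing object behind it, while a Tensor ** out-parameter needs only a pointer variable.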

In allocate_temp it’s failing at:
*out_temp = new_temp;
That makes sense, because in the first example you’re passing in an uninitialized pointer, so out_temp doesn’t point at a valid Tensor and the assignment writes through a garbage address, I am pretty sure. Whereas in the second example you did everything correctly.
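A rough sketch of that failing assignment, again with made-up names (FakeTensor, TryAllocateTemp, DemoNullPointer, DemoStackObject) rather than TensorFlow code. The null check here is my own addition so the bad case can be shown safely; I am not claiming the real allocate_temp performs such a check. It also assumes the pointer was initialized to nullptr, which the original `Tensor *q_t;` was not, so the real failure is undefined behavior rather than a clean error.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for tensorflow::Tensor.
struct FakeTensor {
  int num_elements = 0;
};

// Mimics the shape of the assignment `*out_temp = new_temp;`.
// The nullptr guard is an illustrative assumption, not TensorFlow behavior.
std::string TryAllocateTemp(int n, FakeTensor* out_temp) {
  if (out_temp == nullptr) {
    return "error: out_temp is null";
  }
  FakeTensor new_temp;
  new_temp.num_elements = n;
  *out_temp = new_temp;  // writes into whatever object out_temp points at
  return "ok";
}

// A pointer with no object behind it: nothing to assign into.
std::string DemoNullPointer() {
  FakeTensor* q_t = nullptr;
  return TryAllocateTemp(3, q_t);
}

// A real object on the stack: the assignment has a valid target.
std::string DemoStackObject() {
  FakeTensor q_t;
  return TryAllocateTemp(3, &q_t);
}
```

With an uninitialized pointer instead of nullptr, the guard would not help and the write would land at an arbitrary address, which matches the segfault in the backtrace.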

Oh my bad, posted before I saw your reply

No worries, I guess I shouldn’t have asked the question in the first place. I sort of forget some C++ fundamentals while trying to get my head around the TF API…

Nah man, you’re fine! Doesn’t seem like anyone else knew the answer either :stuck_out_tongue: