SwitchNet, a fast transform neural network

SwitchNet is a fast transform (not transformer) based neural network that uses the fast Walsh-Hadamard transform (WHT).
Such fast transforms have an equivalent matrix form that you can view as a fixed neural network weight matrix.
Of course something must be made adjustable, but you can do that!
The gain is full connectivity at a cost of n·log2(n) operations instead of the usual n² operations for an actual matrix.
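To make the matrix-equivalence and cost claims concrete, here is a minimal sketch (in plain NumPy, not TensorFlow): an iterative fast WHT that does n·log2(n) add/subtract butterflies, checked against an explicit Sylvester-construction Hadamard matrix, which is the "fixed weight matrix" view of the same transform.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform via butterflies: O(n log2 n) for n a power of 2."""
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    assert n > 0 and (n & (n - 1)) == 0, "length must be a power of 2"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b  # one add/subtract butterfly
        h *= 2
    return x

def hadamard(n):
    """The equivalent dense matrix (Sylvester construction): H_{2m} = kron([[1,1],[1,-1]], H_m)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
    return H

# The fast transform and the dense n x n matrix multiply agree exactly.
x = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0])
assert np.allclose(fwht(x), hadamard(8) @ x)
```

The butterfly loop touches each element once per level and there are log2(n) levels, versus n² multiply-adds for the dense matrix form.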
I see TensorFlow has the WHT at least in some form now:

It probably has CReLU and the other things you need:
Here is some JavaScript code:
And a blog post about SwitchNet:

I guess I can look for free GPU/TensorFlow access, now that the prerequisites seem to be there. I didn’t want to have to start with TensorFlow and then find I had to spend 6 months of my life writing an optimized CUDA kernel.

The exact question, then, is whether TensorFlow has CReLU, or whether you can get the equivalent with, per layer, paired ReLUs computing ReLU(x) and ReLU(-x).
Does TensorFlow have the fast Walsh-Hadamard transform as a GPU kernel, as a CPU shared library, or only in Python?
I particularly don’t want to have to write a CUDA GPU kernel for the WHT, because I think that would take 6 months of my life. If there is only a CPU shared-library form, my own code is already performant.
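The two formulations are the same thing: CReLU is just the concatenation of ReLU(x) and ReLU(-x), which doubles the dimension and loses no information, since x = ReLU(x) − ReLU(−x). A minimal NumPy sketch (TensorFlow has historically shipped this as `tf.nn.crelu`, though check the current API):

```python
import numpy as np

def crelu(x):
    """CReLU: concatenate ReLU(x) and ReLU(-x).
    Doubles the dimension and preserves all information in x."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)])

x = np.array([1.5, -2.0, 0.0, 3.0])
y = crelu(x)  # [1.5, 0.0, 0.0, 3.0,  0.0, 2.0, 0.0, 0.0]

# The input is exactly recoverable: x = ReLU(x) - ReLU(-x).
half = len(y) // 2
assert np.allclose(y[:half] - y[half:], x)
```

So even if a library lacks a dedicated CReLU op, two ReLUs and a concatenation per layer give the identical result.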
I’m sure you can set up an efficient matrix operation to apply sign flips to a column vector.
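For the sign-flip step, an elementwise multiply by a ±1 vector is enough: it is mathematically the same as multiplying by a diagonal matrix, but costs only O(n). A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
flips = rng.choice([-1.0, 1.0], size=n)  # the sign pattern (the adjustable part)
x = rng.standard_normal(n)

# Elementwise multiply applies the sign flips in O(n)...
fast = flips * x

# ...and is exactly equivalent to a diagonal-matrix multiply.
dense = np.diag(flips) @ x
assert np.allclose(fast, dense)
```

In a framework like TensorFlow, the same pattern is a trainable ±1-ish (or unconstrained) vector multiplied elementwise into the activations before or after the WHT.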