Is the depth_to_space function totally different from PixelShuffle?

Hi all.
Currently, I am working on porting a trained PyTorch model to TensorFlow 2.3.
For the Conv2d layers, the feature-map outputs of PyTorch and TensorFlow are the same, so the conversion of the Conv2d layers from PyTorch to TF2 is fine.
However, when I add a depth_to_space layer to the model, the output contains a lot of mosaic artifacts, whereas the output of PixelShuffle does not have this problem.
Is the mechanism of depth_to_space different from that of PixelShuffle?
Or is there a problem in my code?

Here is part of my code.
The PyTorch model:
# some linear mapping layers with nn.Conv2d
self.pslayers = [nn.Conv2d(d, 4*12, 3, 1, 3//2)]
self.pslayers.extend([nn.Conv2d(12, 4*num_channels, 3, 1, 3//2)])
self.pslayers = nn.Sequential(*self.pslayers)

The TF2 model:
# some linear mapping layers with Conv2D
x = Conv2D(4*12, 3, padding="same")(x)
x = tf.nn.depth_to_space(x, 2, data_format='NHWC')
x = Conv2D(4, 3, padding="same")(x)
out = tf.nn.depth_to_space(x, 2, data_format='NHWC')

The conversion for Conv2d layer from torch weights to tf2:
onnx_1_w_num =, 3, 1, 0)
onnx_1_b_num =
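
For reference, here is a minimal NumPy-only sketch of the layout change this step involves. PyTorch stores `nn.Conv2d` weights as (out_channels, in_channels, kH, kW), while a Keras `Conv2D` kernel is (kH, kW, in_channels, out_channels). The arrays `torch_w` and `torch_b` are hypothetical stand-ins with made-up shapes, not the variables from the post.

```python
import numpy as np

# Hypothetical stand-in for a PyTorch Conv2d weight: (out, in, kH, kW).
torch_w = np.arange(48 * 8 * 3 * 3, dtype=np.float32).reshape(48, 8, 3, 3)
torch_b = np.zeros(48, dtype=np.float32)

# Keras Conv2D kernel layout is (kH, kW, in, out).
tf_w = np.transpose(torch_w, (2, 3, 1, 0))
tf_b = torch_b  # the bias is a 1-D vector of length `out` in both layouts

print(tf_w.shape)  # (3, 3, 8, 48)
```

This transpose only changes the axis order of the kernel. It does not reorder the output channels, which is what matters for the layer that follows the convolution.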

I have been struggling with this problem for a long while.
I would very much appreciate any help!

Probably the organization of the channels in your code is problematic. You can refer to the following example for a proper usage:

Thanks for the reply.
By "the organization of the channels", do you mean the output of the Conv2D layer, or the Conv2d weights that I copy from the PyTorch model to the TensorFlow model?
For the Conv2d layer, I am quite sure that the output of nn.Conv2d(d, 4*12, 3, 1, 3//2) is equal to that of Conv2D(4*12, 3, padding="same")(x), as I checked their feature-map outputs.
Does the problem occur in tf.nn.depth_to_space?
Could you give me some further hints?
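
To make the channel-organization point concrete, here is a small NumPy sketch that models both operations with reshape and transpose. It assumes the documented layouts: `nn.PixelShuffle` reads input channel `c*r*r + i*r + j` for output channel `c` at sub-pixel offset `(i, j)`, while `tf.nn.depth_to_space` in NHWC reads channel `(i*r + j)*C + c`. The function names are illustrative, not library APIs.

```python
import numpy as np

def pixel_shuffle_nchw(x, r):
    """NumPy model of torch.nn.PixelShuffle for an (N, C*r*r, H, W) array."""
    n, c, h, w = x.shape
    c_out = c // (r * r)
    x = x.reshape(n, c_out, r, r, h, w)   # channel index = c*r*r + i*r + j
    x = x.transpose(0, 1, 4, 2, 5, 3)     # -> (N, C, H, r, W, r)
    return x.reshape(n, c_out, h * r, w * r)

def depth_to_space_nhwc(x, r):
    """NumPy model of tf.nn.depth_to_space for an (N, H, W, C*r*r) array."""
    n, h, w, c = x.shape
    c_out = c // (r * r)
    x = x.reshape(n, h, w, r, r, c_out)   # channel index = (i*r + j)*C + c
    x = x.transpose(0, 1, 3, 2, 4, 5)     # -> (N, H, r, W, r, C)
    return x.reshape(n, h * r, w * r, c_out)

def layouts_agree(c_out, r, h=2, w=3):
    """Feed the same feature map to both models and compare the results."""
    x_nchw = np.arange(c_out * r * r * h * w, dtype=np.float32)
    x_nchw = x_nchw.reshape(1, c_out * r * r, h, w)
    x_nhwc = x_nchw.transpose(0, 2, 3, 1)
    a = pixel_shuffle_nchw(x_nchw, r).transpose(0, 2, 3, 1)
    b = depth_to_space_nhwc(x_nhwc, r)
    return np.array_equal(a, b)

print(layouts_agree(c_out=1, r=4))   # True: one output channel, orders coincide
print(layouts_agree(c_out=12, r=2))  # False: 12 output channels, orders differ
```

So identical convolution outputs can still give different images after the shuffle whenever the shuffled tensor has more than one channel.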

I found that depth_to_space works fine (the mosaic is gone) when I reduce the upsampling to a single step (one x4 upsampling), i.e.
x = Conv2D(4*4, 3, padding="same")(x)
x = tf.nn.depth_to_space(x, 4, data_format='NHWC')
However, the mosaic is still there when upsampling in two steps (x2, x2).
Do you have any idea why?
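
One explanation consistent with this observation: the two channel orderings coincide when the shuffled output has a single channel (the x4 case, 16 -> 1), but not when it has 12 channels (the first x2 step, 48 -> 12). The sketch below assumes the PyTorch model applies `nn.PixelShuffle(2)` after each convolution (those layers are not shown in the post). It builds a permutation of the convolution's output channels so that `depth_to_space` reproduces the PixelShuffle result. The helper name and the arrays `tf_w`, `tf_b` are hypothetical.

```python
import numpy as np

def pixelshuffle_to_d2s_perm(c_out, r):
    """perm[k_tf] = PyTorch channel that must be placed at TF channel k_tf."""
    perm = np.empty(c_out * r * r, dtype=np.int64)
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                # depth_to_space (NHWC) reads channel (i*r + j)*C + c,
                # PixelShuffle reads channel c*r*r + i*r + j.
                perm[(i * r + j) * c_out + c] = c * r * r + i * r + j
    return perm

perm = pixelshuffle_to_d2s_perm(c_out=12, r=2)

# Hypothetical Keras-layout kernel (kH, kW, in, out) and bias for the 48-channel conv.
tf_w = np.zeros((3, 3, 8, 48), dtype=np.float32)
tf_b = np.arange(48, dtype=np.float32)

# Reorder only the output-channel axis of the kernel, and the bias to match.
tf_w_fixed = tf_w[..., perm]
tf_b_fixed = tf_b[perm]

# With one output channel the permutation is the identity, so nothing changes.
print(pixelshuffle_to_d2s_perm(c_out=1, r=2))  # [0 1 2 3]
```

Under that assumption, only the first convolution (48 channels) would need its output channels reordered; the second one (4 channels, one output channel after the shuffle) could be copied as is.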