Transform weights from an NCHW model to an NHWC model

I have two models whose structures are exactly the same, except:

  1. The first model starts with a Permute((2,3,1)) layer to convert NCHW to NHWC
  2. The data_format of the Conv2D layers is channels_last in the first model, while it is channels_first in the second model:
            if data_format == 'channels_last':
                self.source = tf.keras.layers.Permute((2, 3, 1))(self.inputs)
            else:
                self.source = self.inputs

            # 2. Common Network Layers
            self.conv1 = tf.keras.layers.Conv2D(
                name="conv1",
                kernel_size=(3, 3),

The first model (channels_last) is exported to TFLite for inference on an edge device; NHWC is the only format supported there.

The second model (channels_first) is used for GPU training.

I want to read the weights of each layer in the second model (channels_first), transform them, and load them into the first model (channels_last).

Can that be done?

  • How do I transform the weights of a Conv2D layer from channels_first to channels_last?
  • I am afraid all the Dense layers are affected as well, because their inputs differ between the two models. I am unsure how to transform those, too.

Layers have the methods get_weights() and set_weights().
These calls work with a list of weight matrices in NumPy format. You can use get_weights() to get the list of weights and pick out the matrix you need from that list (you have to know where it is). You can then use NumPy's transpose to create a new matrix, and set that as the matrix in the new weights. I think a Conv2D layer's weights are [4-dimensional kernel matrix, 1-dimensional bias vector].

old_weights = old_layer.get_weights()
orig_matrices = old_weights[0]
transposed = …
new_weights = [transposed, old_weights[1]]

Now for the … part. The Conv2D weights have 4 dimensions. You have to work out what those dimensions are for the old and new matrices, and then copy each one across.
You have to allocate a new NumPy array that contains the set of n×n weight matrices.
I need to draw these things out on paper to get all of the dimensions right.
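One thing worth checking before drawing anything: as far as I know, tf.keras stores Conv2D kernels as (kernel_h, kernel_w, in_channels, out_channels) regardless of data_format, so the conv kernels may copy over unchanged between these two models. For cases where a layout conversion really is needed (e.g. coming from an OIHW-layout framework), the transpose mechanics look like this NumPy sketch; the layout names and shapes are assumptions for the demo:

```python
import numpy as np

# Hypothetical example: converting a 4-D conv kernel between layouts.
# Suppose the source kernel is stored as (out, in, kh, kw) (OIHW) and
# the target expects (kh, kw, in, out) (HWIO, which tf.keras uses).
# np.transpose with the right permutation does the conversion.
kernel_oihw = np.arange(2 * 3 * 5 * 7).reshape(2, 3, 5, 7)   # (out, in, kh, kw)
kernel_hwio = np.transpose(kernel_oihw, (2, 3, 1, 0))        # (kh, kw, in, out)

assert kernel_hwio.shape == (5, 7, 3, 2)
# Same element, addressed in each layout: hwio[kh, kw, in, out] == oihw[out, in, kh, kw]
assert kernel_hwio[1, 2, 0, 1] == kernel_oihw[1, 0, 1, 2]
```

The permutation (2, 3, 1, 0) simply lists which source axis ends up in each target position, which is exactly the perm argument tf.transpose would take.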

This answer from StackOverflow recommends transposing the images and using the “old” matrix multiply:

The image shape is (N, C, H, W) and we want the output to have shape (N, H, W, C). Therefore we need to apply tf.transpose with a well-chosen permutation perm.

It may be possible to start by transposing the weights in the convolutional layers, but as you say the issue will be when you get to a Flatten and Dense layer, assuming you're performing classification here.
You will also need to ensure that the Flatten layer uses the correct format (channels_first vs channels_last is a parameter on Flatten).
If this doesn't work, another approach could be to create a custom transpose layer as the first layer of the TFLite network. That way you don't need to change any weights throughout the network; this layer simply transposes the input image into the format the rest of the model expects.
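The Flatten/Dense issue can be handled by permuting the rows of the first Dense kernel after the Flatten: a channels_first Flatten emits features in (C, H, W) order, while a channels_last Flatten emits them in (H, W, C) order. A NumPy sketch with made-up shapes, verifying that the permuted kernel gives identical outputs:

```python
import numpy as np

# Hypothetical shapes for the feature map entering Flatten and the Dense width.
rng = np.random.default_rng(0)
C, H, W, units = 3, 4, 5, 10

# Dense kernel trained on a channels_first Flatten, i.e. rows in (C, H, W) order.
w_cf = rng.standard_normal((C * H * W, units))

# Reorder the rows from (C, H, W) ordering to (H, W, C) ordering.
w_cl = w_cf.reshape(C, H, W, units).transpose(1, 2, 0, 3).reshape(H * W * C, units)

# Check equivalence on one feature map:
fmap = rng.standard_normal((C, H, W))
y_cf = fmap.reshape(-1) @ w_cf                     # channels_first flatten + original kernel
y_cl = fmap.transpose(1, 2, 0).reshape(-1) @ w_cl  # channels_last flatten + permuted kernel
assert np.allclose(y_cf, y_cl)
```

Only the first Dense after the Flatten needs this; any Dense layers deeper in the network see the same 1-D input either way and can be copied unchanged.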
I’m answering on my phone while travelling so apologies for no links and any duplication of answers.