Expand the head of my model with hidden layers from MobileNetV2

I’m using a model with a MobileNetV2 body and a head consisting of a fully connected layer with softmax. I’m also adding a replay buffer to enable continual-learning capabilities. I am trying to copy a few of the hidden layers from the base of my model (the MobileNetV2) and add them to my head before the fully connected layer. I’d like feedback on whether I’m doing this correctly, since in experiments on the CORe50 benchmark the accuracy stays the same regardless of the number of hidden layers. This is my code for the head:

def buildHeadHidden(self, sl_units=32, hidden_layers=1):
    base_model = MobileNetV2(input_shape=(self.image_size, self.image_size, 3),
                             alpha=1.0,
                             include_top=False,
                             weights='imagenet')

    self.sl_units = sl_units
    self.head = tf.keras.Sequential([
        layers.Flatten(input_shape=(4, 4, 1280)),
        layers.Dense(
            units=sl_units,
            activation='relu',
            kernel_regularizer=l2(0.01),
            bias_regularizer=l2(0.01)),
    ])

    if hidden_layers != 0:
        # Add the last n layers of MobileNetV2; from_config recreates
        # each layer's architecture with freshly initialized weights.
        last_n_layers = base_model.layers[-hidden_layers:]
        for layer in last_n_layers:
            new_layer = layer.__class__.from_config(layer.get_config())
            self.head.add(new_layer)

    # Final softmax layer
    self.head.add(layers.Dense(
        units=50,  # number of classes
        activation='softmax',
        kernel_regularizer=l2(0.01),
        bias_regularizer=l2(0.01)))

    self.head.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
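One thing I wasn’t sure about: my understanding is that `from_config` only recreates a layer’s architecture, not its trained weights, so the copied layers start from random initialization. To actually carry the pre-trained weights over, I believe I’d also need `set_weights`. A minimal sketch of what I mean, using a small Dense layer as a stand-in for a MobileNetV2 block:

```python
import numpy as np
import tensorflow as tf

# Stand-in for a pre-trained MobileNetV2 layer: a small built Dense layer.
src = tf.keras.layers.Dense(4)
src.build((None, 8))

# from_config alone recreates the architecture with fresh random weights.
clone = src.__class__.from_config(src.get_config())
clone.build((None, 8))

# Copying the weights explicitly makes the clone match the source.
clone.set_weights(src.get_weights())
assert all(np.array_equal(a, b)
           for a, b in zip(src.get_weights(), clone.get_weights()))
```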

@Nikolas_Stavrou,

Adding more hidden layers does not necessarily increase accuracy; it is purely empirical. However, you can try hyperparameter tuning to choose the number of hidden layers that is optimal for your model.

Thank you!

What do you mean by hyperparameter tuning? I couldn’t get the shapes to match when using layers from MobileNetV2, so I just used a ReLU dense layer. I’ve only experimented with 1, 2, and 3 hidden layers, and it seems that the accuracy gets worse.

@Nikolas_Stavrou,

What do you mean by hyperparameter tuning?

The process of selecting the right set of hyperparameters for your machine learning application. For more details, please refer to Introduction to the Keras Tuner  |  TensorFlow Core
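For illustration, here is the idea in its most minimal form. `train_and_evaluate` is a hypothetical stand-in for “build the head with n hidden layers, train it, and return validation accuracy”; the Keras Tuner automates and generalizes this loop:

```python
# Minimal grid search over the number of hidden layers.
# `train_and_evaluate` is a hypothetical placeholder, not a real API.
def grid_search(train_and_evaluate, candidates=(0, 1, 2, 3)):
    # Evaluate each candidate and keep the one with the best score.
    results = {n: train_and_evaluate(n) for n in candidates}
    best = max(results, key=results.get)
    return best, results

# Toy run: pretend validation accuracy peaks at 2 hidden layers.
best, results = grid_search(lambda n: [0.70, 0.72, 0.75, 0.71][n])
print(best)  # 2
```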

We suggest you follow the documentation on how to Create the base model from the pre-trained convnets and add a head on top of the base model.
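For illustration, the pattern from that guide looks roughly like this. This is a sketch, not your exact setup: `IMAGE_SIZE = 128` is an assumption (it matches the `(4, 4, 1280)` feature shape in your code), and `weights=None` is used here only to keep the sketch self-contained; in practice you would use `weights='imagenet'` as in the tutorial:

```python
import tensorflow as tf

IMAGE_SIZE = 128   # assumption: consistent with the (4, 4, 1280) feature shape
NUM_CLASSES = 50

# Pre-trained convolutional base, frozen. Use weights='imagenet' in
# practice; None here only keeps the sketch offline.
base = tf.keras.applications.MobileNetV2(
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    include_top=False,
    weights=None)
base.trainable = False

# Classification head on top of the base.
inputs = tf.keras.Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
x = base(inputs, training=False)                 # keep BatchNorm in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```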

Thank you!

I’ve seen the site you recommended, as well as another tutorial, and made a few changes so that I’ll use a few of the MobileNetV2 layers in my head and make them trainable. These are my new functions:

def buildBaseHidden(self, hidden_layers=0):
    self.base = tf.keras.applications.MobileNetV2(input_shape=(self.image_size, self.image_size, 3),
                                                  alpha=1.0,
                                                  include_top=False,
                                                  weights='imagenet')
    for l in self.base.layers:
        if '_BN' in l.name:
            l.renorm = True

    self.base.trainable = False

    # Copy the last <hidden_layers> hidden layers from the base model to the head
    self.transfer_layers = []
    if hidden_layers > 0:
        for i in range(hidden_layers):
            layer = self.base.layers.pop()
            self.transfer_layers.append(layer)
        self.transfer_layers.reverse()

    inputs = tf.keras.Input(shape=(self.image_size, self.image_size, 3))
    outputs = self.base(inputs)
    self.feature_extractor = tf.keras.Model(inputs, outputs)
    self.feature_extractor.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
                                   loss='categorical_crossentropy',
                                   metrics=['accuracy'])

def buildHeadHidden(self, sl_units=32, hidden_layers=0):
    self.sl_units = sl_units
    self.head = tf.keras.Sequential([
        layers.Flatten(input_shape=(4, 4, 1280)),
        layers.Dense(
            units=sl_units,
            activation='relu',
            kernel_regularizer=l2(0.01),
            bias_regularizer=l2(0.01)),
    ])

    # Add the transferred hidden layers to the head and make them trainable
    for layer in self.transfer_layers:
        layer.trainable = True
        self.head.add(layer)

    # Final softmax layer
    self.head.add(layers.Dense(
        units=50,  # number of classes
        activation='softmax',
        kernel_regularizer=l2(0.01),
        bias_regularizer=l2(0.01)))

    self.head.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])

I’ve run the CORe50 benchmark with different numbers of trainable hidden layers (0–5), but the accuracy remains more or less the same, so I was wondering whether I’m doing something wrong. Shouldn’t we see either an increase or a decrease in accuracy?
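For completeness, since I mentioned it at the start: the replay buffer is independent of this issue; it is just a fixed-size store of past samples that I mix into each training batch. A minimal sketch of the kind of buffer I mean (reservoir sampling; the class and names here are illustrative, not my actual implementation):

```python
import random

class ReplayBuffer:
    """Fixed-size reservoir of past (image, label) samples for rehearsal."""

    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        # Reservoir sampling: once full, every sample seen so far has an
        # equal chance of being retained in the buffer.
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def draw(self, batch_size):
        # Sample a rehearsal batch (without replacement) from the buffer.
        k = min(batch_size, len(self.samples))
        return self.rng.sample(self.samples, k)

buf = ReplayBuffer(capacity=3)
for i in range(10):
    buf.add(i)
print(len(buf.samples))  # 3
```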