I am trying to move a couple of hidden layers from MobileNetV2 into the head of my model and make them trainable, in an attempt to increase my accuracy when running a benchmark (CORe50 NICv2 - 391 benchmark).
I wrote the code to do so, but after running the benchmark and graphing my results, it seems that no matter which hidden layers I move, I get no difference in accuracy at all (neither better nor worse), so I am wondering if I am doing something wrong.
This is the part of the code where I build the base of the model; you can see that I remove (truncate) the last N hidden layers given by the parameter.
```python
def buildBaseHidden(self, hidden_layers=0):
    baseModel = tf.keras.applications.MobileNetV2(
        input_shape=(self.image_size, self.image_size, 3),
        alpha=1.0,
        include_top=False,
        weights='imagenet')

    # Batch normalization layers replaced with batch renormalization layers - better for CL
    for l in baseModel.layers:
        if '_BN' in l.name:
            l.renorm = True

    baseModel.trainable = False

    # Truncate: the new output is the layer just before the last N hidden layers
    base_model_truncated = tf.keras.Model(
        inputs=baseModel.input,
        outputs=baseModel.layers[-hidden_layers - 1].output)
    self.base = base_model_truncated

    inputs = tf.keras.Input(shape=(self.image_size, self.image_size, 3))
    f = inputs
    f_out = self.base(f)
    self.feature_extractor = tf.keras.Model(f, f_out)
    self.feature_extractor.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        loss='categorical_crossentropy',
        metrics=['accuracy'])
```
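Since the only moving part of the truncation is the index expression, the slicing can be checked in isolation; here is a minimal sketch with a plain list standing in for `baseModel.layers` (the layer names are made up for illustration):

```python
# Stand-in for baseModel.layers; names are illustrative only.
layer_names = ["conv", "bn", "relu", "pool", "out"]

def truncation_output(layers, hidden_layers):
    # Same index expression as in buildBaseHidden: the new output is the
    # layer sitting just before the last `hidden_layers` layers.
    return layers[-hidden_layers - 1]

truncation_output(layer_names, 0)  # "out" - nothing removed
truncation_output(layer_names, 3)  # "bn" - last three layers dropped
```

For `hidden_layers=0` the model is left intact (the output is the last layer), which matches the default in `buildBaseHidden`.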
Below you can also see how I build the head of the model, where I add those N hidden layers from MobileNetV2 and make them trainable, right before the flattening layer.
```python
def buildHeadHidden(self, sl_units=32, hidden_layers=0):
    baseModel = tf.keras.applications.MobileNetV2(
        input_shape=(self.image_size, self.image_size, 3),
        alpha=1.0,
        include_top=False,
        weights='imagenet')
    self.sl_units = sl_units

    # Create a new head model
    self.head = tf.keras.Sequential()

    # Add the last N layers of MobileNetV2 to the head model and unfreeze them
    for i in range(-hidden_layers, 0):
        layer = baseModel.layers[i]
        layer.trainable = True
        self.head.add(layer)

    self.head.add(layers.Flatten(input_shape=(4, 4, 1280)))

    # Removed the dense layer since we add hidden layers from MobileNetV2 now
    self.head.add(layers.Dense(
        units=sl_units,
        activation='relu',
        kernel_regularizer=l2(0.01),
        bias_regularizer=l2(0.01)))

    # Softmax layer (last layer)
    self.head.add(layers.Dense(
        units=50,  # Number of classes
        activation='softmax',
        kernel_regularizer=l2(0.01),
        bias_regularizer=l2(0.01)))

    self.head.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
```
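One quick sanity check that the head is wired as intended is to compare parameter counts against `model.summary()`. A back-of-the-envelope sketch, assuming the Flatten layer really receives a `(4, 4, 1280)` feature map and `sl_units=32` (both taken from the code above):

```python
# Parameter counts for the two Dense layers of the head (excluding the
# moved MobileNetV2 layers); compare these against model.summary().
flatten_out = 4 * 4 * 1280          # 20480 features after Flatten
sl_units = 32

dense_params = flatten_out * sl_units + sl_units  # weights + biases
softmax_params = sl_units * 50 + 50               # 50 CORe50 classes

print(dense_params)    # 655392
print(softmax_params)  # 1650
```

If `model.summary()` reports different counts for these layers, the head is not seeing the feature map shape you expect.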
Lastly, I am printing the body and the head of my model to make sure that the layers have moved and that they are trainable, as you can see below (I am not showing the whole body, since it is all the MobileNetV2 layers minus the last N that were moved; in this example we moved 3 layers).
```
...a lot of MobileNetV2 layers...
block_16_expand: False
block_16_expand_BN: False
block_16_expand_relu: False
block_16_depthwise: False
block_16_depthwise_BN: False
block_16_depthwise_relu: False
block_16_project: False
block_16_project_BN: False

Head model trainable status:
Conv_1: True
Conv_1_bn: True
out_relu: True
flatten: True
dense: True
dense_1: True

Complete model trainable status:
input_4: True
model: True
sequential: True
```
Any help or feedback is appreciated.
Thanks in advance,