How to extract the body of a transformer-like model and fine-tune it on different data

Arjun_Reddy · June 5, 2023, 3:34am

A BERT-like transformer model (I am not asking about BERT itself in this thread) is pre-trained with two objectives, masked language modeling and next sentence prediction, and it also supports different input shapes. I am building something similar: a base model pre-trained with two objectives, denoising of time-series data and a triplet loss on time series. After pre-training, I want to take just the base model body and fine-tune it in TensorFlow on different data with a different shape. How is this written in low-level TensorFlow code? Concretely, I want to extract the body after pre-training, attach new inputs and new outputs to it, and fine-tune the whole model on a dataset with a different shape.

Base model architecture:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = layers.Input((3000, 7))
x = layers.Conv1D(32, 3, activation=tf.nn.leaky_relu,padding='same')(inputs)
x = layers.MaxPooling1D(2)(x)
x = layers.MultiHeadAttention(num_heads=4, key_dim=2)(x, x)

x = layers.BatchNormalization()(x)

x = layers.Conv1D(64, 3, activation=tf.nn.leaky_relu,padding='same')(x)
x = layers.MaxPooling1D(2)(x)
x = layers.MultiHeadAttention(num_heads=4, key_dim=2)(x, x)

x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(512, activation='relu')(x)
x = layers.Dense(256, activation='relu')(x)
outputs = layers.Dense(128, activation='relu')(x)


# Use the final Dense(128) tensor; passing `x` would stop at the 256-unit layer
base_model = keras.Model(inputs=inputs, outputs=outputs)

Pre-training the base model:

input_denoising = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))
x = base_model(input_denoising)
x=layers.Dense(3000*7,activation=tf.nn.leaky_relu)(x)
output_denoising=layers.Reshape((3000, 7))(x)

input_1 = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))
input_2 = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))
input_3 = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))

output_1 = base_model(input_1)
output_2 = base_model(input_2)
output_3 = base_model(input_3)

concatenated = layers.concatenate([output_1, output_2, output_3])
output = layers.Dense(2, activation='softmax')(concatenated)


combined_model = keras.Model(
    inputs=[input_denoising, input_1, input_2, input_3],
    outputs=[output_denoising, output],
)
combined_model.compile(optimizer='adam', loss=['mse', 'categorical_crossentropy'])

combined_model.summary()

So now I want to remove all the inputs and outputs, add new inputs and outputs that match the new data, and fine-tune on that dataset. How can I do this?

Laxma_Reddy_Patlolla · June 5, 2023, 6:21pm

Hi @Arjun_Reddy,

To fine-tune a pre-trained model in TensorFlow, you can load the pre-trained weights, modify the input and output layers, optionally freeze the base model’s weights, add new layers for your specific task, and compile and train the fine-tuned model.

Here is a more detailed explanation of the steps:

Load the pre-trained weights: This initializes the model with the representations learned during pre-training.
Modify the input and output layers: Remove the existing input and output layers from the base model and replace them with new input and output layers aligned with your new data.
Freeze base model layers (optional): Depending on your requirements, you may choose to freeze the weights of the base model to prevent them from being updated during fine-tuning.
Add new layers for your specific task: Add any additional layers necessary for your specific task on top of the base model’s outputs.
Compile and train the fine-tuned model: compile the model with an appropriate optimizer and loss function for your task, and then train it on your new dataset. Make sure to replace new_data_shape with the shape of your new input data and num_classes with the appropriate number of output classes for your specific task.
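
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the small `base_model` here is only a stand-in for the pre-trained body, `new_data_shape = (3000, 5)` and `num_classes = 4` are hypothetical placeholder values, and the `input_adapter` projection is one assumed way to feed data with a different channel count into a body that expects 7 channels (the sequence length is assumed to stay at 3000).

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical values: replace with the real shape and class count.
new_data_shape = (3000, 5)
num_classes = 4

# Stand-in for the pre-trained body (much smaller than the model in the
# question). In practice, reuse the `base_model` object that was trained
# inside `combined_model`.
body_in = layers.Input((3000, 7))
h = layers.Conv1D(32, 3, activation=tf.nn.leaky_relu, padding="same")(body_in)
h = layers.GlobalAveragePooling1D()(h)
body_out = layers.Dense(128, activation="relu")(h)
base_model = keras.Model(body_in, body_out, name="base_model")

# Steps 1-2: a new input, plus a per-timestep projection down to the
# 7 channels the body expects (assumption: only the channel count differs).
new_input = keras.Input(shape=new_data_shape)
x = layers.Dense(7, name="input_adapter")(new_input)

# Step 3 (optional): freeze the body so only the new layers train at first.
base_model.trainable = False
# training=False keeps any BatchNormalization layers in inference mode.
x = base_model(x, training=False)

# Step 4: new task-specific head.
new_output = layers.Dense(num_classes, activation="softmax")(x)
fine_tune_model = keras.Model(new_input, new_output)

# Step 5: compile and train (random data, only to show the calls).
fine_tune_model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
x_new = np.random.rand(8, *new_data_shape).astype("float32")
y_new = np.random.randint(0, num_classes, size=(8,))
fine_tune_model.fit(x_new, y_new, epochs=1, batch_size=4, verbose=0)

# To fine-tune the whole model afterwards, unfreeze the body and
# recompile, typically with a lower learning rate.
base_model.trainable = True
fine_tune_model.compile(
    optimizer=keras.optimizers.Adam(1e-5),
    loss="sparse_categorical_crossentropy",
)
```

Because `base_model` is used as a shared layer inside `combined_model`, the same Python object already carries the pre-trained weights once `combined_model.fit` has run. If pre-training happened in another session, `base_model.save_weights(...)` and `base_model.load_weights(...)` can carry the weights across.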

By following these steps, you can extract the body of the pre-trained base model, add new inputs and outputs, and fine-tune the whole model on a new dataset with different shapes in TensorFlow.

I hope this helps!

Thanks.

Arjun_Reddy · June 5, 2023, 6:25pm

Thanks a lot, @Laxma_Reddy_Patlolla. Everything you mentioned is exactly my goal, but I want to know how to implement those steps in code. It would be a great help if you could show how to remove the input and output layers with respect to my code above, with its base_model and combined model. For example, say I want to remove all the inputs and add a new_input with shape (3000, 13), and then remove all the outputs and add a single sigmoid layer as the final output.

Thanks a lot again for your time and consideration
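
For the concrete case described here (a new input of shape (3000, 13) and a single sigmoid output), one possible approach is to rebuild the body for 13 channels and copy over every pre-trained layer whose weight shapes still match. The sketch below is an illustration under assumptions, not a confirmed answer from the thread: `build_body` is a hypothetical helper that repeats the layer stack from the question, the body is assumed to end at the Dense(128) layer, and the freshly built `base_model` only stands in for the one that was actually pre-trained.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def build_body(n_channels, seq_len=3000):
    """Hypothetical helper: same layer stack as the base model in the question."""
    inputs = layers.Input((seq_len, n_channels))
    x = layers.Conv1D(32, 3, activation=tf.nn.leaky_relu, padding="same")(inputs)
    x = layers.MaxPooling1D(2)(x)
    x = layers.MultiHeadAttention(num_heads=4, key_dim=2)(x, x)
    x = layers.BatchNormalization()(x)
    x = layers.Conv1D(64, 3, activation=tf.nn.leaky_relu, padding="same")(x)
    x = layers.MaxPooling1D(2)(x)
    x = layers.MultiHeadAttention(num_heads=4, key_dim=2)(x, x)
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dense(256, activation="relu")(x)
    outputs = layers.Dense(128, activation="relu")(x)
    return keras.Model(inputs, outputs)


# Stand-in for the pre-trained body; in practice use the trained `base_model`.
base_model = build_body(n_channels=7)

# Same architecture, but built for the new 13-channel input.
new_body = build_body(n_channels=13)

# Copy weights layer by layer wherever every weight shape matches.
# Only the first Conv1D differs (kernel (3, 7, 32) vs. (3, 13, 32)),
# so that layer keeps its fresh initialisation and is learned during
# fine-tuning.
copied = []
for old, new in zip(base_model.layers, new_body.layers):
    old_w, new_w = old.get_weights(), new.get_weights()
    same_shapes = len(old_w) == len(new_w) and all(
        a.shape == b.shape for a, b in zip(old_w, new_w)
    )
    if old_w and same_shapes:
        new.set_weights(old_w)
        copied.append(new.name)

# New input and a single sigmoid unit as the only output.
new_input = keras.Input(shape=(3000, 13))
features = new_body(new_input)
new_output = layers.Dense(1, activation="sigmoid")(features)
model = keras.Model(new_input, new_output)

model.compile(
    optimizer=keras.optimizers.Adam(1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
# model.fit(x_labelled, y_labelled, ...)  # hypothetical data: (n, 3000, 13), labels in {0, 1}
```

This works because every layer after the first convolution depends only on the number of feature maps coming from the previous layer, not on the number of raw input channels, so its pre-trained weights can be reused unchanged.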