Questions about fine-tuning BERT

Hi,

I am new to the deep learning and NLP area, and I have been trying to follow the tutorials on the TensorFlow website.
I found one tutorial called fine-tuning BERT, which uses the BertTokenizer to do the classification, and another tutorial called using BERT for text classification, which just imports a preprocessing layer and an encoding layer to encode the text.

I am a little confused about the difference between these two tutorials. The fine-tuning one seems a lot harder. Does the fine-tuning approach mean it can be applied to custom data?

Thanks

Hi @REal_Slim,

Here is the distinction between the two approaches:

  1. Using BERT for Text Classification:
  • This approach uses a pre-trained BERT model as a feature extractor. The pre-processing layer and encoding layer tokenize and encode the input text, and you then add your own classification layers (dense layers) on top of the BERT encoding and train them for your specific task (see the first sketch after this list).
  • This method is simpler because the BERT model’s weights are frozen during training; only the weights of the additional layers are learned on your task. It is essentially transfer learning, leveraging the knowledge BERT gained from a large corpus.
  2. Fine-tuning BERT:
  • Fine-tuning takes a pre-trained BERT model and trains it further on your specific task with your custom dataset. This means the weights of the entire BERT model (or a substantial part of it) are adjusted during training (the second sketch below shows the typical changes).
  • Fine-tuning is more complex and requires more computational resources. It is typically done when your task or domain is substantially different from the pre-training data, for example a specialized dataset or a different language.
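
To make the first approach concrete, here is a minimal sketch of a frozen-BERT classifier in TensorFlow/Keras. The TF Hub handles, dropout rate, and binary output head are illustrative assumptions, not code copied from either tutorial:

```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers the ops the preprocessing model needs

# Illustrative TF Hub handles; swap in whichever BERT variant you want to use.
PREPROCESS_HANDLE = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_HANDLE = "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"

def build_classifier(fine_tune_bert: bool = False) -> tf.keras.Model:
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="text")
    # Preprocessing layer: raw strings -> input_word_ids / input_mask / input_type_ids.
    preprocess = hub.KerasLayer(PREPROCESS_HANDLE, name="preprocessing")
    # Encoder layer: trainable=False freezes BERT (feature extraction),
    # trainable=True unfreezes it (fine-tuning).
    encoder = hub.KerasLayer(ENCODER_HANDLE, trainable=fine_tune_bert, name="BERT_encoder")
    outputs = encoder(preprocess(text_input))
    # pooled_output is a fixed-size representation of the whole input text.
    x = tf.keras.layers.Dropout(0.1)(outputs["pooled_output"])
    logits = tf.keras.layers.Dense(1, name="classifier")(x)  # binary classification head
    return tf.keras.Model(text_input, logits)

# Approach 1: only the dropout/dense head is trained; BERT's weights stay frozen.
model = build_classifier(fine_tune_bert=False)
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```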

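For the second approach, the sketch below shows what typically changes when fine-tuning: the encoder is unfrozen and a much smaller learning rate is used. The optimizer and learning rate here are again illustrative assumptions, not taken from the tutorial:

```python
# Approach 2: unfreeze BERT so its weights are updated on your custom dataset.
# Fine-tuning usually needs a much smaller learning rate than training the head alone.
model = build_classifier(fine_tune_bert=True)  # reuses the builder sketched above
model.compile(
    optimizer=tf.keras.optimizers.Adam(3e-5),  # small learning rate typical for BERT fine-tuning
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
# model.fit(train_ds, validation_data=val_ds, epochs=3)  # hypothetical tf.data datasets
```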
I hope this helps.

Thanks.