Reducing the parameter size of LaBSE (Language-agnostic BERT Sentence Embedding) for practical usage

To get good-quality language-agnostic sentence embeddings, LaBSE is a good choice. But because of its parameter count (the encoder is BERT-base sized, yet the whole model has 471M parameters, most of them in the huge multilingual token-embedding table), it is hard to fine-tune or deploy on a small GPU or machine.

So I applied the method from the paper “Load What You Need: Smaller Versions of Multilingual BERT” to build a smaller version of LaBSE, and I was able to reduce LaBSE’s parameters to 47% of the original without a significant performance drop, using TF Hub and tensorflow/models.
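
Roughly, the idea of the paper is to keep only the vocabulary entries that the target languages actually use and slice the token-embedding matrix to match, leaving the transformer layers untouched. Here is a minimal toy sketch of that idea (not the actual script in the repo; all names and data below are illustrative):

```python
import numpy as np

def select_vocab(full_vocab, corpus_tokens,
                 always_keep=("[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]")):
    """Return the kept tokens and their indices in the original vocab."""
    needed = set(always_keep) | set(corpus_tokens)
    kept = [(i, tok) for i, tok in enumerate(full_vocab) if tok in needed]
    return [tok for _, tok in kept], [i for i, _ in kept]

def shrink_embeddings(embedding_matrix, indices):
    """Slice the [vocab_size, hidden_size] embedding matrix down to the kept rows."""
    return embedding_matrix[np.array(indices)]

# Toy example: a 9-token "multilingual" vocab and a corpus that only needs English.
full_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]",
              "hello", "world", "bonjour", "こんにちは"]
corpus_tokens = ["hello", "world"]
embeddings = np.random.randn(len(full_vocab), 768).astype("float32")

tokens, indices = select_vocab(full_vocab, corpus_tokens)
small_embeddings = shrink_embeddings(embeddings, indices)
print(len(tokens), small_embeddings.shape)  # 7 (7, 768)
```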

GitHub: jeongukjae/smaller-labse



Nice work, Jeon!!

Does the preprocessing model still work with your model? Or is there still a need for it?
Did you think about publishing this to TF Hub too?


Thank you! 🙂

The preprocessing model is exported using the modified vocab file, so this model has to be used with the updated preprocessing model, not the original one. (You can check make_smaller_labse.py#L37.)
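
Roughly, the two exported models are meant to be used together like this (just a sketch: the paths below are placeholders for wherever the SavedModels were exported, and the output key may be "default" or "pooled_output" depending on how the encoder is exported):

```python
import tensorflow as tf
import tensorflow_hub as hub

# Placeholder paths for the exported preprocessing model and encoder.
preprocessor = hub.KerasLayer("./models/smaller_labse_preprocess")
encoder = hub.KerasLayer("./models/smaller_labse_encoder")

sentences = tf.constant(["Hello, world!", "Hola, mundo!"])
encoder_inputs = preprocessor(sentences)   # token ids built from the modified vocab
outputs = encoder(encoder_inputs)
embeddings = tf.nn.l2_normalize(outputs["default"], axis=-1)  # sentence embeddings
print(embeddings.shape)                    # (2, 768)
```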

And I hadn’t thought about publishing this model, because I didn’t train it myself; I just patched it. Is it okay to publish?

Yes, I think you should!!

Of course, mention the base model in the description and so on.
I’d also publish the updated preprocessing model to keep things consistent.


Oh, then I’ll check the docs and send a PR to publish this model!


This is perfect!! Thanks!

Keep me posted. I’d love to try it with this Colab: Classify text with BERT (TensorFlow text tutorial).
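
Roughly, I’d expect it to drop into that tutorial like this (just a sketch: the handles below are placeholders for your exported models, not published TF Hub URLs, and I’m assuming the encoder follows the standard BERT output layout):

```python
import tensorflow as tf
import tensorflow_hub as hub

def build_classifier(preprocess_handle, encoder_handle, num_classes=2):
    # Same wiring as the "Classify text with BERT" tutorial, with the handles swapped.
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="text")
    encoder_inputs = hub.KerasLayer(preprocess_handle, name="preprocessing")(text_input)
    outputs = hub.KerasLayer(encoder_handle, trainable=True, name="encoder")(encoder_inputs)
    x = tf.keras.layers.Dropout(0.1)(outputs["pooled_output"])
    logits = tf.keras.layers.Dense(num_classes, name="classifier")(x)
    return tf.keras.Model(text_input, logits)

model = build_classifier("./models/smaller_labse_preprocess",
                         "./models/smaller_labse_encoder")
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
```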


Thanks!! I created a PR to upload this model.


Very good!! Thanks for contributing to the community!


And it’s live on TensorFlow Hub!

Well done!!

I’d update the documentation to link to your preprocessing model too!
Great work!
