Reducing the parameter size of LaBSE (Language-Agnostic BERT Sentence Embedding) for practical usage

LaBSE is a good choice for high-quality, language-agnostic sentence embeddings. But due to its parameter size (BERT-base architecture, yet 471M parameters), it is hard to fine-tune or deploy on a small GPU or machine.

So I applied the method from the paper “Load What You Need: Smaller Versions of Multilingual BERT” to build a smaller version of LaBSE. Using TF-Hub and tensorflow/models, I reduced LaBSE’s parameters to 47% of the original without a significant performance drop.
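The core idea of the paper is that most of a multilingual model’s parameters sit in its huge token-embedding table, so keeping only the vocabulary entries your target languages actually use shrinks the model without touching the transformer layers. The sketch below illustrates this with a toy vocabulary and plain Python lists; all names and numbers are illustrative, not the actual LaBSE export code.

```python
def prune_vocab(vocab, embeddings, tokens_to_keep):
    """Keep only selected tokens and their embedding rows.

    vocab:          list of token strings (row i of `embeddings` belongs to vocab[i])
    embeddings:     list of embedding vectors (lists of floats)
    tokens_to_keep: set of token strings to retain
    Returns (new_vocab, new_embeddings, old_id -> new_id map).
    """
    new_vocab, new_embeddings, id_map = [], [], {}
    for old_id, token in enumerate(vocab):
        if token in tokens_to_keep:
            id_map[old_id] = len(new_vocab)
            new_vocab.append(token)
            new_embeddings.append(embeddings[old_id])
    return new_vocab, new_embeddings, id_map

# Toy example: a 6-token "multilingual" vocab pruned to 4 tokens.
vocab = ["[PAD]", "[UNK]", "hello", "bonjour", "hola", "##ing"]
embeddings = [[float(i)] * 4 for i in range(len(vocab))]  # 6 x 4 embedding table
kept = {"[PAD]", "[UNK]", "hello", "##ing"}

new_vocab, new_emb, id_map = prune_vocab(vocab, embeddings, kept)
print(new_vocab)  # ['[PAD]', '[UNK]', 'hello', '##ing']
print(id_map)     # {0: 0, 1: 1, 2: 2, 5: 3}
```

In the real model the same slicing is applied to the embedding variable of the saved checkpoint, and the reduced vocab file is written out alongside it so the tokenizer stays in sync.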



Nice work Jeon!!

Does the preprocessing model still work with your model, or is there still a need for a new one?
Did you think about publishing this to TFHub too?


Thank you 🙂

The preprocessing model is exported using the modified vocab file, so this model has to be used with the updated preprocessing model, not the original one. (You can check here.)

And I hadn’t thought about publishing this model, because I didn’t train it myself, just patched it. Is it okay to publish?

Yes I think you should!!

Of course, mention the base model in the description and all.
I’d also publish the updated preprocessing model to keep things consistent.


Oh, then I will check the docs and send a PR publishing this model!


This is perfect!! Thanks!

Keep me posted, I’d love to try it on this Colab: Classify text with BERT  |  Text  |  TensorFlow


Thanks!! I created a PR to upload this model.


Very good!! thanks for contributing to the community!


And it’s live: TensorFlow Hub

well done!!

I’d update the documentation to link to your preprocessing too!
Great work!