Is there an Android equivalent to the Apple Word Tagger Model?

On the iOS version of our app we use Apple's Word Tagger Model to help us parse some complex sentences so that we can extract the parts we actually want, and it is working fairly well. I need something similar on Android.

Basically, I would feed it a large sample of sentences along with a classification for every word in each sentence; once it has learned those, I can pass it new sentences and it will tell me the classification of each word.

A similar example to what we want would be an address parser. Imagine that the training data looks like:

123, house_number
E., direction_short
Main, street_name
Av., avenue
17123, house_number
W., direction_short
West, direction_long
Ct., court
10th, street_name
etc, etc

Then, when we pass it “87 W. Central St.”, the word tagger would tell us that 87 is house_number, W. is direction_short, and so on.
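To put it in code, this is the kind of API I am after (purely a hypothetical interface to illustrate the per-word output I want, not an existing Android class):

```kotlin
// Hypothetical interface, just to illustrate the behaviour I'm looking for:
// one label per word, the way Apple's word tagger reports it.
interface WordTagger {
    fun tag(sentence: String): List<Pair<String, String>>  // (word, label) pairs
}

fun demo(tagger: WordTagger) {
    // Expected output: 87 -> house_number, W. -> direction_short, Central -> street_name, ...
    tagger.tag("87 W. Central St.").forEach { (word, label) ->
        println("$word -> $label")
    }
}
```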

I tried using the BertNLClassifier and the NLClassifier with my own model, but the results I get back analyze the entire “87 W. Central St.” and give me a score for each possible label; at least, I think that is what it is doing.
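For reference, this is roughly the code path I am on right now (a sketch; the model file name is a placeholder). The scores come back for the sentence as a whole rather than per word:

```kotlin
import android.content.Context
import org.tensorflow.lite.task.text.nlclassifier.BertNLClassifier

// Sketch of my current attempt with the TFLite Task Library.
// The whole input string gets one score per label; nothing is reported per word.
fun classifyAddress(context: Context) {
    val classifier = BertNLClassifier.createFromFile(context, "address_model.tflite") // placeholder file
    val categories = classifier.classify("87 W. Central St.")
    categories.forEach { category ->
        // One score for each label, computed over the entire sentence.
        println("${category.label}: ${category.score}")
    }
}
```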

I need a model that will tell me the category (label) of each word in my sentence. Is there such a thing for Android?

Hope that makes sense.

Thanks.

If you are looking for something canned like Apple's Word Tagger Model, you can take a look at ML Kit:

This looks super close to what I need, but my data is pretty custom: normal words, but I need to categorize them in a way that is specific to my app. Is there something like this that can be trained with my own data and categories?

Thanks.

If you want something near production-ready with your custom data, you can explore:

https://github.com/legacyai/tf-transformers/blob/main/src/tf_transformers/notebooks/tutorials/ner_albert.ipynb

The good thing about training a tagger model based on ALBERT is that it will likely be supported in TFLite. TensorFlow Hub also provides MobileBERT checkpoints, another model that is supported in TFLite.
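For context, here is a rough sketch (my own assumptions about input order, sequence length, and tokenization; not a tested recipe) of how a converted token-classification model like that could be run on Android with the plain TFLite Interpreter to get one label per token:

```kotlin
import java.io.File
import org.tensorflow.lite.Interpreter

// Sketch only: assumes a BERT-style token classifier exported with three INT32
// inputs of shape [1, seqLen] and one output of shape [1, seqLen, numLabels].
// Tokenization/padding is omitted; the real input order depends on the export.
fun tagTokens(
    modelFile: File,
    inputIds: IntArray,    // token ids, padded to seqLen
    inputMask: IntArray,   // 1 for real tokens, 0 for padding
    segmentIds: IntArray,  // all zeros for a single sentence
    seqLen: Int,
    labels: List<String>
): List<String> {
    val interpreter = Interpreter(modelFile)
    try {
        val inputs = arrayOf<Any>(arrayOf(inputIds), arrayOf(inputMask), arrayOf(segmentIds))
        val logits = Array(1) { Array(seqLen) { FloatArray(labels.size) } }
        val outputs: MutableMap<Int, Any> = mutableMapOf(0 to logits)
        interpreter.runForMultipleInputsOutputs(inputs, outputs)
        // Pick the highest-scoring label for every token position.
        return logits[0].map { row -> labels[row.indices.maxByOrNull { row[it] }!!] }
    } finally {
        interpreter.close()
    }
}
```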

Tagging @lgusm for more details.

I have been working on getting MobileBERT running on Android, but I haven't succeeded yet. The issue is that the input tensors should be in INT32 format (at least that is what the ML model binding and the description on TF Hub tell me), but TensorBuffer under the hood only supports UINT8 and FLOAT32. What was the point of preparing MobileBERT if its input/output tensor formats are not supported on a mobile device?
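The workaround I am assuming should work (this is my assumption, not something the TF Hub page says) is to skip TensorBuffer and ML model binding entirely and hand the Interpreter the INT32 data as plain int arrays or IntBuffers, which I believe the core runtime accepts, along the lines of the sketch above:

```kotlin
import java.nio.IntBuffer

// Hypothetical helper: build BERT-style INT32 inputs without TensorBuffer.
// TensorBuffer only handles UINT8/FLOAT32, but the Interpreter itself takes
// int arrays (and, in recent releases, IntBuffer) for INT32 tensors.
fun buildInt32Inputs(tokenIds: IntArray, mask: IntArray, segmentIds: IntArray): Array<Any> =
    arrayOf(
        IntBuffer.wrap(tokenIds),   // input_word_ids, shape [1, seqLen]
        IntBuffer.wrap(mask),       // input_mask
        IntBuffer.wrap(segmentIds)  // segment_ids / input_type_ids
    )
```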