BERT/BERT-based pre-processing issues with TFLite

Sid · August 30, 2021, 7:03pm

Hi there everyone. Pre-processing for BERT-based models has been the biggest cause of pain for me while working on my project (Android). Currently I am using SentencePiece’s C++ library (I was already using the NDK for other things), and it works as I expected, kinda! But I would like to figure out whether there is a way to integrate the pre-processing into the TFLite model itself, because that would remove a lot of code.

This is one of the notebooks I was experimenting with before doing it the C++ way. It is pretty small: it takes less than 5 minutes to run the entire notebook, build the TFLite embedding model, and reach the error below when you run the TFLite model.

interpreter.invoke()
RuntimeError                              Traceback (most recent call last)
in ()
----> 1 interpreter.invoke()
/usr/local/lib/python3.7/dist-packages/tensorflow/lite/python/interpreter.py in invoke(self)
    873     """
    874     self._ensure_safe()
--> 875     self._interpreter.Invoke()
    876
    877   def reset_all_variables(self):
RuntimeError: Table not initialized.
(while executing 'WordpieceTokenizeWithOffsets' via Eager)
Node number 2167 (TfLiteFlexDelegate) failed to invoke.

I assume that some kind of table used in the pre-processing model isn’t being initialized in the TFLite version, either when the model is loaded into the interpreter or during tensor allocation. But I have no idea how to go about debugging it.

cpoohee · September 2, 2021, 12:31am

I have encountered similar issues when converting BERT to TFLite in the past, running on TF v2.5.

Firstly, the vocab.txt was not able to initialise the hash table in TFLite. That is what causes your “Table not initialized” runtime error. I’m not sure whether, with the latest TF v2.6, setting converter.experimental_enable_resource_variables = True during TFLite conversion improves the situation.

Secondly, if you are using preprocessor = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3") or similar tokenizer helper layers that depend on tensorflow-text, you will have difficulty compiling mobile TFLite binaries that support the tensorflow-text ops as Flex delegate ops. See “Using tensorflow_text with tflite” (Issue #287, tensorflow/text on GitHub).

I ended up having to rewrite the vocab.txt loading so that it goes into TFLite-friendly tensors, to bypass the initialization issue. I also rewrote parts of the BERT tokenizer on mobile and rewrote some tensorflow-text ops as TFLite-friendly ops. It can work, but you will likely end up with spaghetti code, with some parts living in the TFLite model and others on Android. If you are already using SentencePiece through the C++ NDK, it might be much neater to stay that way until the above issues are fixed.

Sid · September 2, 2021, 6:34am

@cpoohee Thanks for the reply. While I was experimenting yesterday, I ended up writing the pre-processing model using TensorFlow Text, but it still had the same problem; my guess is that the cause is what you mentioned. Also, I don’t think experimental_enable_resource_variables helped, not that I expected it to, but still.

I was wondering which of the custom ops that TensorFlow Text uses you replaced, as there are quite a few of them. It would save me time in identifying which ops work with TFLite and which don’t, because currently I don’t even know how to identify them; I need to research that first.

This might be true, as I don’t know the level of jank I would need to pull off to offload this to the model lol. But currently I am trying to eliminate the NDK, as it inflates project complexity wayyyy more than I would like now that I am shifting to Flutter. I removed every other use of C++ and rewrote those parts natively; this is the only part left.

Sid · September 2, 2021, 7:31am

How to link TensorFlow_Text ops into Android/iOS binaries?

E/tflite  (27736): Op type not registered 'CaseFoldUTF8' in binary running on localhost. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
E/tflite  (27736): Op type not registered 'RegexSplitWithOffsets' in binary running on localhost. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

Oh, looking at the issue/thread you made, I assume these two are it.
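For anyone trying to work out which ops are missing from a given binary, one low-tech option is to collect the names from log lines like the two above. The following is only a minimal Python sketch: the function name missing_ops is hypothetical, and it only finds ops that the runtime has already complained about, so further ops may surface once these are linked.

```python
import re
from typing import Dict, List

# Matches messages of the form: Op type not registered 'SomeOp' in binary ...
_MISSING_OP = re.compile(r"Op type not registered '([^']+)'")


def missing_ops(log_text: str) -> List[str]:
    """Return the unique op names reported as unregistered, in first-seen order."""
    seen: Dict[str, None] = {}
    for name in _MISSING_OP.findall(log_text):
        seen.setdefault(name, None)  # dict keeps insertion order and drops repeats
    return list(seen)


log = """
E/tflite  (27736): Op type not registered 'CaseFoldUTF8' in binary running on localhost.
E/tflite  (27736): Op type not registered 'RegexSplitWithOffsets' in binary running on localhost.
E/tflite  (27736): Op type not registered 'CaseFoldUTF8' in binary running on localhost.
"""
print(missing_ops(log))  # ['CaseFoldUTF8', 'RegexSplitWithOffsets']
```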

https://tensorflow-prod.ospodiscourse.com/t/how-to-link-tensorflow-text-ops-into-android-ios-binaries/3241/16?u=sid
Thanks for your research into the topic. It seems like a lot of work either way, my C++ SentencePiece way or yours. Now I have a choice to make about really moving the project to Flutter: considering the amount of effort it would take to make this work on iOS with Flutter, I might as well make another native app for iOS.

lgusm · September 2, 2021, 11:26am

Hi everyone.

Thanks for the feedback.
I think we need to improve the preprocessing TFLite conversion and the whole BERT story for mobile.

For now, I’d suggest you look into these two already-converted BERT models: TensorFlow Hub

If you need some fine tuning, Model Maker can help you: Text classification with TensorFlow Lite Model Maker

Again, this is not optimal. I’ll try to bring this up with the specific teams.

lgusm · September 2, 2021, 11:50am

cc: @battery for some insights

battery · September 2, 2021, 12:37pm

It would be better to try a recent TF version, e.g. TF 2.6 or tf-nightly.

And to run tf.text ops, you need to link the TF Text op library into your application. The TFLite Select TF ops option will then find the registered TF Text operators.

cpoohee · September 2, 2021, 2:07pm

The TF Text ops can be converted to TFLite with SELECT_TF_OPS. They run fine in Python builds for non-mobile platforms, but not on mobile. I had difficulties compiling custom AAR binaries with the bash tensorflow/lite/tools/build_aar.sh script: it doesn’t recognize the tensorflow-text ops and fails to link them. Monolithic builds don’t include the text ops either. I haven’t tried custom-building iOS binaries, but I suspect that doesn’t work either. I’m not sure whether some flag exists that makes the binaries link properly. So far, I could not find any working example.
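For reference, the desktop-side conversion mentioned here can be sketched roughly as follows. This is a minimal sketch, not the exact script used in this thread: the function name and the saved_model_dir argument are hypothetical, it assumes tensorflow and tensorflow_text are installed, and experimental_enable_resource_variables is only meaningful on TF 2.6 or later. It does nothing to solve the mobile linking problem described above.

```python
def convert_with_select_tf_ops(saved_model_dir: str) -> bytes:
    """Convert a SavedModel to a TFLite flatbuffer, keeping unsupported ops as TF (Flex) ops."""
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow as tf
    import tensorflow_text  # noqa: F401  (importing registers the tf.text ops)

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,  # use native TFLite kernels where they exist
        tf.lite.OpsSet.SELECT_TF_OPS,    # fall back to TF kernels via the Flex delegate
    ]
    # Flag discussed earlier in the thread; whether it fixes the table
    # initialization error is not confirmed here.
    converter.experimental_enable_resource_variables = True
    return converter.convert()
```

The returned bytes can be written to a .tflite file; running that file still requires a runtime that has the same TF Text ops linked in.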

cpoohee · September 2, 2021, 3:27pm

@Sid,
Well, I will try my best to describe my current method; it’s probably not optimal. I hope it gives you a better gauge of the effort needed compared to the C++ NDK SentencePiece route.

I was using this and this as references for the BERT tokenizer.

Based on what I observed, the BERT tokenizer consists of two general steps: a basic tokenizer followed by a wordpiece tokenizer. The basic tokenizer strips whitespace, case-folds, and splits off special characters such as punctuation and Chinese characters. The wordpiece tokenizer then takes the pre-processed splits from the basic tokenizer step and converts the split words to integer tokens by looking them up in the vocab.txt dictionary. It also handles words that map to multiple integer tokens, since that is how the wordpiece method works.

The ops or methods that needed to be modified or offloaded to the mobile side are (the op names may not be exact):

Basic tokenizer:

case_fold_utf8
normalize_utf8
regex_split_with_offsets

Wordpiece tokenizer:

wordpiece_tokenize_with_offsets
vocab_lookup_table

In my case, I rewrote most of the basic tokenizer steps in Flutter. Case-folding and NFD normalization libraries are available for Flutter. As for the regex, the expression in the original Python code is Perl-style, while the Flutter regex library is based on JS, so the regex syntax conversion is a little ugly. I used this tool to help me. You can see that a simple punctuation regex in Perl turns into a long string of expressions in JS. Once the splitting was done, I recombined the pieces into a simple space-separated string and sent it to the TF model. There, the standard TF string split op can handle space-separated strings without relying on the tf-text regex ops.
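As a rough illustration of the app-side steps described above (case folding, NFD normalization, punctuation splitting, re-joining with spaces), here is a minimal Python sketch. It is not the Flutter code from this post: the function names are hypothetical, the punctuation rule uses Unicode character categories in the style of the reference BERT tokenizer instead of a regex, stripping combining marks after NFD is an assumption matching uncased BERT models, and Chinese-character splitting is left out.

```python
import unicodedata


def is_punctuation(ch: str) -> bool:
    """ASCII symbols plus any character in a Unicode 'P*' category count as punctuation."""
    cp = ord(ch)
    if 33 <= cp <= 47 or 58 <= cp <= 64 or 91 <= cp <= 96 or 123 <= cp <= 126:
        return True
    return unicodedata.category(ch).startswith("P")


def basic_tokenize(text: str) -> str:
    """Case-fold, NFD-normalize, drop accents, and return space-separated tokens."""
    text = unicodedata.normalize("NFD", text.casefold())
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Mn":
            continue  # combining mark left over from NFD (assumption: uncased model)
        out.append(f" {ch} " if is_punctuation(ch) else ch)
    # split() collapses all whitespace runs, so the result is cleanly space-separated
    return " ".join("".join(out).split())


print(basic_tokenize("Héllo, World!"))  # hello , world !
```

The output is a single string, so it can be passed to a model that only needs a plain whitespace split on its input.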

For the wordpiece tokenizer, the main issue was the failed initialization of the lookup table. I reimplemented the lookup table as a standard tensor of string elements. During TFLite conversion, the tensor-based dictionary is already loaded as constant values and gets converted accordingly. The methods that accessed the lookup table were modified to search for strings in the tensor instead. This is especially convoluted for wordpiece_tokenize_with_offsets: I had to base my version on the google-research BERT tokenization.py Python code and reimplement it in TF style.
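To make the idea concrete, here is a minimal pure-Python sketch of greedy longest-match wordpiece tokenization in which the vocabulary is an ordered sequence of strings (position = token id) that is searched directly, with no hash table. It only illustrates the logic; it is not the TF-style reimplementation described above, the names lookup, wordpiece_ids, and encode are hypothetical, and the toy VOCAB stands in for a real vocab.txt.

```python
from typing import List, Sequence


def lookup(vocab: Sequence[str], piece: str) -> int:
    """Linear scan over the vocab; returns the index, or -1 if the piece is absent."""
    for i, entry in enumerate(vocab):
        if entry == piece:
            return i
    return -1


def wordpiece_ids(word: str, vocab: Sequence[str],
                  unk: str = "[UNK]", max_chars: int = 100) -> List[int]:
    """Greedy longest-match-first split of one word into wordpiece ids."""
    unk_id = lookup(vocab, unk)
    if len(word) > max_chars:
        return [unk_id]
    ids: List[int] = []
    start = 0
    while start < len(word):
        end, match = len(word), -1
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry the ## prefix
            match = lookup(vocab, piece)
            if match != -1:
                break
            end -= 1
        if match == -1:
            return [unk_id]  # no piece fits: the whole word becomes [UNK]
        ids.append(match)
        start = end
    return ids


def encode(text: str, vocab: Sequence[str]) -> List[int]:
    """Tokenize a space-separated string into a flat list of ids."""
    return [i for word in text.split() for i in wordpiece_ids(word, vocab)]


VOCAB = ("[PAD]", "[UNK]", "un", "##aff", "##able", "play", "##ing", ",")
print(encode("unaffable playing , xyz", VOCAB))  # [2, 3, 4, 5, 6, 7, 1]
```

A linear scan is slow for a full-size vocabulary; it is used here only because it mirrors searching a constant tensor of strings rather than a table that has to be initialized.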

Lastly, the current Flutter TFLite libraries don’t support text input/output and lack good support for select ops. I had to modify the existing Flutter libraries to work with strings and select ops for my case. See my fork if you want to try it. I have only made modifications that suit my needs, and I don’t guarantee it will work in all cases. I was able to make it work on both iOS and Android simulators, but I still have issues such as having to resort to large monolithic binaries. With so many modifications needed to make it work, I strongly suspect I have hidden bugs along the way too. Frankly, I am still a newbie to Flutter. I am only using it because the developer I am helping uses Flutter to develop apps.

Sid · September 2, 2021, 4:32pm

Thanks for the detailed explanation, @cpoohee. I appreciate the time you took to explain the method. I got the gist of how to go about it from your last issue, but this makes it much clearer. I am still noob-ish in terms of the depth of my ML knowledge.

Going to Flutter makes it much more difficult (if I want to keep using the SentencePiece C++ lib), as I now have to go from Dart to Java (probably using a method channel) and then from Java to C++ through a JNI bridge (I kinda have some of this set up, but it will require more code). It looks like I may be able to skip some of that with dart ffi, but I have no experience with it.
But for now I will keep the pre-processing and the model separate, as it seems like a lot of work to integrate them right now.

lgusm · September 3, 2021, 9:46am

Sid,

It came to my attention that for on-device use you may be able to skip the preprocessing and use the Task Library. It provides a BERT NL classifier and BERT QA for that.

It might make your life much easier.

Sid · September 3, 2021, 4:02pm

Thanks for the resource links, @lgusm. Unfortunately, my application’s use case is summarization.

The “Integrate BERT natural language classifier | TensorFlow Lite” guide is what made me go with SentencePiece through the C++ API in the first place. I went the manual way because, in my brief flip through the source code, it didn’t seem to expose any public functions for just the pre-processing (if my memory serves correctly). Also, I am pretty sure it uses out-of-graph pre-processing just like my implementation anyway, so it seemed like I wasn’t missing out on any performance improvements.

My main motive for starting the thread was that I had begun porting my application to Flutter and wanted to move the pre-processing complexity to the model side. But that seems very unstable at the moment, as I saw from @cpoohee’s story and my own tinkering (hope this receives love from the TFLite devs). Currently I am exploring using dart ffi to wrap SentencePiece’s C++ API, as that approach worked before I started moving to Flutter. If I succeed (it seems very promising from today’s testing), I will try to publish it to pub.dev so others can use it if they need it.