Tokenizer vs TextVectorization

I was wondering what the difference between these two classes is. Can I use TextVectorization directly in place of the Tokenizer?


The two classes have different purposes and use cases. If you’re building an end-to-end deep learning model for a specific NLP task, TextVectorization is usually more convenient because it handles tokenization and vectorization in one step and can be integrated into your model as a layer.

Tokenizer, on the other hand, may be more appropriate if you need more control over the tokenization process.
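
To make the "one step" point concrete, here is a minimal sketch of TextVectorization, assuming TensorFlow 2.x is installed. The sample sentences and the parameter values (`max_tokens=1000`, `output_sequence_length=8`) are made up for illustration and are not recommendations.

```python
import tensorflow as tf

# Toy corpus, invented for this example.
texts = ["the cat sat on the mat", "the dog ate my homework"]

# The layer lowercases, strips punctuation, splits on whitespace,
# and maps each token to an integer id, all in one call.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=1000,           # assumed vocabulary cap
    output_mode="int",         # emit integer token ids
    output_sequence_length=8,  # pad or truncate every row to 8 ids
)

# Build the vocabulary from the corpus.
dataset = tf.data.Dataset.from_tensor_slices(texts).batch(2)
vectorizer.adapt(dataset)

# Raw strings in, padded integer sequences out.
token_ids = vectorizer(tf.constant(texts))
vocabulary = vectorizer.get_vocabulary()

print(token_ids.shape)
print(vocabulary[:5])
```

Index 0 of the vocabulary is reserved for padding and index 1 for out-of-vocabulary tokens, so real words start at index 2. Because the result is an ordinary Keras layer, it can typically be placed in front of an embedding layer so the model accepts raw strings.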

Thank you!