Which solution works best for this case?

Hello all,
I have a question about which solution fits this case best:

Case:
I have three kinds of text blocks (ingredients, preparation, and dosage) and want to classify a given block into one of these types.
I have a lot of already-categorized data for training.

I hope someone has some papers, GitHub links, or, even better, experience with this kind of case.

Best regards!

You can start with this tutorial: "Classify structured data with feature columns" (TensorFlow Core).
It demonstrates how to deal with various data types.

I’m sorry, I don’t mean data types. I want to classify whether a text is an ingredient statement, a preparation statement, or a dosage statement.

It seems that your problem is just a 3-class classification task on a single recipe step.
More generally, on this specific topic I suggest taking a look at NER for cooking recipes applied to the RecipeDB dataset, as it is more interesting:

https://arxiv.org/abs/2004.12184
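
To make the 3-class framing concrete, here is a minimal baseline sketch in plain Python. It uses a multinomial Naive Bayes classifier with add-one smoothing, which is my own choice of a simple starting point and not something taken from the paper above. The example texts, the label names, and the helper functions `tokenize`, `train`, and `predict` are all hypothetical; you would replace the toy data with your own labeled blocks.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count classes and per-class words from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # add-one (Laplace) smoothed word likelihood
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; replace with your own labeled text blocks.
examples = [
    ("200 g flour, 2 eggs, 1 pinch of salt", "ingredients"),
    ("sugar, butter, milk, vanilla extract", "ingredients"),
    ("Mix the flour and eggs, then bake for 20 minutes", "preparation"),
    ("Stir the mixture and heat until it boils", "preparation"),
    ("Take one tablet twice daily after meals", "dosage"),
    ("Adults: 10 ml three times daily", "dosage"),
]
model = train(examples)
print(predict(model, "Take two tablets daily"))
```

A baseline like this mainly gives you a number to beat before moving on to heavier models.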

Maybe experiment with Transformers, which are arguably among the most advanced architectures for quite a few domains.

Some ideas:

+1 (note that Keras preprocessing layers and/or the TF Text API may be the more recommended option; cc @markdaoust)

Preprocessing the dataset and creating a pipeline can be challenging, especially with a custom dataset. How about also looking at the tutorials "BERT Preprocessing with TF Text" and "Text classification with an RNN" (TensorFlow), the latter if you want to try a less complex model first?
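
As a rough illustration of the "less complex model first" route, the sketch below combines a Keras `TextVectorization` preprocessing layer with a small bidirectional LSTM and a 3-way softmax output. It is not copied from the tutorials mentioned above. The toy texts, the label encoding (0 = ingredients, 1 = preparation, 2 = dosage), and all hyperparameters are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data; replace with your own labeled text blocks.
texts = [
    "200 g flour, 2 eggs, 1 pinch of salt",
    "sugar, butter, milk, vanilla extract",
    "Mix the flour and eggs, then bake for 20 minutes",
    "Stir the mixture and heat until it boils",
    "Take one tablet twice daily after meals",
    "Adults: 10 ml three times daily",
]
# Assumed label encoding: 0 = ingredients, 1 = preparation, 2 = dosage.
labels = np.array([0, 0, 1, 1, 2, 2])

# Keras preprocessing layer: maps raw strings to padded integer token ids.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=1000, output_sequence_length=20
)
vectorizer.adapt(tf.constant(texts))
x = vectorizer(tf.constant(texts)).numpy()

# Small RNN classifier with one softmax unit per class.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=16),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(x, labels, epochs=3, verbose=0)

# Class probabilities for a new, unseen text block.
new_x = vectorizer(tf.constant(["Add 100 g sugar"])).numpy()
probs = model.predict(new_x, verbose=0)
print(probs.shape)
```

With only six examples the predictions are meaningless; the point is the shape of the pipeline, which stays the same once you plug in the real training data.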