How to create embeddings for text data in TensorFlow and pass them to a neural network model

I am working on a deep-learning-based recommendation system using the MovieLens 100K dataset. It has features such as user ID, movie ID, rating, title, and genre. How can I convert the movie title or genre into embeddings? At the input layer I would be concatenating all the embedding vectors.

Have you already checked these two tutorials?

The following worked for me:

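unique_title_ds = <dataset of unique titles>
max_tokens = 10_000          # upper bound on the title vocabulary size
embedding_dimension = 32     # size of each token embedding vector

# Splits each title into word tokens and maps them to integer ids (0 is padding).
self.title_vectorizer = tf.keras.layers.TextVectorization(max_tokens=max_tokens)
self.title_text_embedding = tf.keras.Sequential([
  self.title_vectorizer,
  # mask_zero=True marks the padding id 0 so that it is ignored downstream.
  tf.keras.layers.Embedding(max_tokens, embedding_dimension, mask_zero=True),
  # Averages the token vectors into one fixed-size vector per title.
  tf.keras.layers.GlobalAveragePooling1D(),
])

# Learn the vocabulary from the titles before the model is used.
self.title_vectorizer.adapt(unique_title_ds)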
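
To make the intermediate shapes easier to follow, here is a minimal standalone sketch of the same pipeline. The three titles are made-up stand-ins for the real dataset, and the layers are called one by one instead of through a `Sequential` so each step can be inspected. The padding mask is passed explicitly here, which is only one way of doing it.

```python
import tensorflow as tf

# Hypothetical sample titles; in practice these come from the MovieLens data.
titles = ["Toy Story", "Heat", "Four Rooms Again"]
unique_title_ds = tf.data.Dataset.from_tensor_slices(titles).batch(2)

max_tokens = 10_000
embedding_dimension = 32

vectorizer = tf.keras.layers.TextVectorization(max_tokens=max_tokens)
vectorizer.adapt(unique_title_ds)

# Each title becomes a row of integer token ids, right-padded with 0.
token_ids = vectorizer(tf.constant(titles))
print(token_ids.shape)  # (3, 3): three titles, longest one has three words

embedding = tf.keras.layers.Embedding(max_tokens, embedding_dimension, mask_zero=True)
pooling = tf.keras.layers.GlobalAveragePooling1D()

token_vectors = embedding(token_ids)        # one vector per token
mask = embedding.compute_mask(token_ids)    # False wherever the id is 0 (padding)
title_vectors = pooling(token_vectors, mask=mask)  # masked average per title
print(title_vectors.shape)  # (3, 32)
```

Every title ends up as one vector of length `embedding_dimension`, regardless of how many words it has, which is what makes it possible to concatenate it with other fixed-size embeddings.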

And invoking my item model concatenates the item embedding with the title embedding:

  def call(self, items):
    return tf.concat([
        self.item_embedding(items),
        self.title_text_embedding(items),
    ], axis=1)
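
For anyone who wants to try the concatenation outside a model class, the following is a rough, self-contained sketch. It assumes that `item_embedding` is a `StringLookup` followed by an `Embedding` over the whole title string, which is not shown in the snippet above, and the titles and the helper name `embed_items` are made up for illustration.

```python
import tensorflow as tf

# Hypothetical sample titles standing in for the real dataset.
titles = ["Toy Story", "Heat", "Four Rooms"]
embedding_dimension = 32
max_tokens = 10_000

# Assumed item embedding: one learned vector per distinct title string.
title_lookup = tf.keras.layers.StringLookup(vocabulary=titles, mask_token=None)
item_embedding = tf.keras.layers.Embedding(
    title_lookup.vocabulary_size(), embedding_dimension)

# Text embedding: average of the word vectors in the title.
vectorizer = tf.keras.layers.TextVectorization(max_tokens=max_tokens)
vectorizer.adapt(tf.data.Dataset.from_tensor_slices(titles).batch(2))
text_embedding = tf.keras.layers.Embedding(
    max_tokens, embedding_dimension, mask_zero=True)
pooling = tf.keras.layers.GlobalAveragePooling1D()


def embed_items(batch):
    """Concatenate the per-title vector with the averaged word vectors."""
    id_vectors = item_embedding(title_lookup(batch))
    token_ids = vectorizer(batch)
    text_vectors = pooling(
        text_embedding(token_ids),
        mask=text_embedding.compute_mask(token_ids))
    return tf.concat([id_vectors, text_vectors], axis=1)


out = embed_items(tf.constant(["Heat", "Toy Story"]))
print(out.shape)  # (2, 64): 32 from the id embedding + 32 from the text embedding
```

The concatenated vector can then be fed to dense layers together with the user embedding or any other feature embeddings.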

Hi, if we consider only one genre per movie, can we use label encoding instead of one-hot encoding?
