LSTM and SimpleRNN require float32?

I am very, very new to TensorFlow. I'm trying to make a model that uses StringLookup, and I decided to try either LSTM or SimpleRNN. But after creating this simple model:

import tensorflow as tf
from tensorflow.keras import layers

def categorical_model(data, vocab):
  # map category strings to integer ids, and back again
  lookup = layers.StringLookup(vocabulary=vocab, output_mode="int")
  reverse_lookup = layers.StringLookup(vocabulary=vocab, invert=True)
  model = tf.keras.Sequential()
  model.add(lookup)
  # I wrote LSTM this time, but the same error happens with SimpleRNN
  model.add(layers.LSTM(10, activation='linear', input_shape=(-1, 3, 1)))
  model.compile(optimizer='adam', loss='mse')
  model.evaluate(data)
  return model, lookup, reverse_lookup

However, when I try to feed input into the model I get this error:
Input 'b' of 'MatMul' Op has type float32 that does not match type int64 of argument 'a'.
The docs show that LSTM can be used on positive integers to return positive integers, but for some reason it seems to require float32? Can someone please explain why I can't use LSTM and SimpleRNN on integers?

Hi @Ori_Raisfeld ,

there's a very helpful TF tutorial, since it sounds like you're working with text:

Maybe a TextVectorization and an Embedding layer before your LSTM (see the tutorial above) would fit your needs better.
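
Very roughly, that kind of pipeline could look like the sketch below (the layer sizes, token limit and example sentences are just placeholders):

import tensorflow as tf
from tensorflow.keras import layers

# toy sentences just to adapt the vectorizer; use your own text here
vectorize = layers.TextVectorization(max_tokens=1000, output_mode="int",
                                     output_sequence_length=3)
vectorize.adapt(["some example sentence", "another example"])

model = tf.keras.Sequential([
    vectorize,                                        # strings -> integer token ids
    layers.Embedding(input_dim=1000, output_dim=16),  # integer ids -> float32 vectors
    layers.LSTM(10),
    layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

The Embedding layer is what turns the integer ids into float32 vectors, so the dtype error from your snippet shouldn't come up.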

Let me know and welcome to the forum :tada:,
Dennis

Thanks, I actually used it before and realized it's not what I need for this project. TextVectorization is for tokenizing text, which I don't need since my data is categorical, not text. And Embedding layers are for grouping similar values, which I also don't need, since there's no point in grouping categorical data. So thanks for the suggestion, but it's not the right fit here.
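
Just to show what I mean by categorical, the lookup / reverse-lookup round trip (with a made-up vocabulary) looks roughly like this:

import tensorflow as tf
from tensorflow.keras import layers

vocab = ["red", "green", "blue"]   # made-up categories
lookup = layers.StringLookup(vocabulary=vocab, output_mode="int")
reverse_lookup = layers.StringLookup(vocabulary=vocab, invert=True)

ids = lookup(tf.constant(["blue", "red", "mystery"]))  # -> [3, 1, 0]; 0 is the out-of-vocabulary index
back = reverse_lookup(ids)                             # -> ["blue", "red", "[UNK]"]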

I searched around for a while, and the sources I found were not very trustworthy. But from what I found, LSTM can return an approximation of the result within the integer range at the output, and since it's an approximation, you can just use layers.Dense() or np.round(), depending on the type of output you receive. This won't work until you train the model, since the model needs a range to learn first, so create the model and train it before evaluating/predicting.
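
For example, here's a rough sketch of what I mean (the vocabulary and training data are made up, and the Lambda layer that casts the ids to float32 is my own workaround for the dtype error, not something I found in the docs):

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

vocab = ["red", "green", "blue"]   # made-up vocabulary
lookup = layers.StringLookup(vocabulary=vocab, output_mode="int")
reverse_lookup = layers.StringLookup(vocabulary=vocab, invert=True)

model = tf.keras.Sequential([
    lookup,                                             # strings -> int64 ids
    # cast the ids to float32 and add a feature dimension so the LSTM accepts them
    layers.Lambda(lambda x: tf.cast(x[..., tf.newaxis], tf.float32)),
    layers.LSTM(10),
    layers.Dense(1),                                    # float approximation of the target id
])
model.compile(optimizer='adam', loss='mse')

# toy training data: a sequence of 3 categories -> the id of the next category
x = tf.constant([["red", "green", "blue"],
                 ["green", "blue", "red"]])
y = tf.cast(lookup(tf.constant(["red", "green"])), tf.float32)[:, tf.newaxis]
model.fit(x, y, epochs=5, verbose=0)

# predictions are floats, so round them back to valid ids and look the strings up
pred = model.predict(x)
ids = np.clip(np.round(pred), 0, len(vocab)).astype("int64")
print(reverse_lookup(ids.squeeze(-1)))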