LSTM only yielding outputs between 0-1

I’m fairly new to training neural networks; the only model I’ve built before is an LSTM for text generation. Now I’m making a model that learns to mimic a guitar effect (like reverb): the input is the raw tone leading up to the current sample, and the target is the effected version of that same sample at that instant. Both the raw input and the effected target look like [x1, x2], with values roughly in (-27, 27), yet the output I get after training the model has numbers between (0, 1). Are these probabilities? What am I missing? Why doesn’t the output look like the data I’m optimizing with?

My code is below. x (raw input) is 3-dimensional: 2 numbers per audio instant, 100 instants leading up to the current one, and as many samples of this sort as I have for training. y (effected) is 2-dimensional, holding the two numbers the model should eventually predict from each (100, 2) input. I hope my concerns are clear, and please let me know if someone can help me understand why I’m not getting the output I want (between (-27, 27)).

import numpy
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.callbacks import ModelCheckpoint

x = numpy.reshape(data_x, (len(raw) - seq_length, seq_length, 2))

y = numpy.reshape(data_y, (len(raw) - seq_length, 2))

# LSTM model
model = Sequential()
model.add(LSTM(256, input_shape=(x.shape[1], x.shape[2])))
model.add(Dense(2, activation='softmax'))

# compile model
model.compile(loss='categorical_crossentropy', optimizer='adam')

checkpoint = ModelCheckpoint('dido4-{epoch:02d}-{loss:.4f}.hdf5', monitor='loss',
                             verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
history =, y, epochs=30, batch_size=20, callbacks=callbacks_list)

Hi there!
That’s a nice application of LSTMs. As for your question: it’s almost certainly because you use the softmax activation function in your output layer. Softmax always returns numbers between 0 and 1 (and they sum to 1 across the outputs; you can look up the graph). The output of a softmax activation can indeed be read as a probability for the corresponding class in a multi-class problem, which is not what you want for a regression target like yours.

I hope this points you in the right direction 🙂

Greetings Adam,
as TimoKer already mentioned, softmax maps real numbers to the interval (0, 1). In general, I suggest you at least learn the basics of the common activation functions; otherwise you won’t understand what your layers are doing, or you might choose a rather suboptimal one for a given kind of layer.
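To make that concrete, here is a minimal sketch of a regression-style version of the model from the question, assuming tf.keras (the shapes mirror the original post): the softmax output is replaced with a linear Dense layer, categorical cross-entropy with mean squared error, and the LSTM emits one vector per sequence rather than one per timestep.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Same shapes as in the question: sequences of 100 steps with 2 features.
model = Sequential()
model.add(LSTM(256, input_shape=(100, 2)))   # return_sequences defaults to False
model.add(Dense(2))                          # linear activation: outputs are unbounded
model.compile(loss='mse', optimizer='adam')  # regression loss instead of cross-entropy

# A forward pass on dummy data checks the output shape matches y: one
# 2-number prediction per input sequence, not squashed into (0, 1).
dummy = np.random.randn(4, 100, 2).astype('float32')
preds = model.predict(dummy)
print(preds.shape)  # (4, 2)
```

With a linear output and an MSE loss, trained predictions are free to reach the (-27, 27) range of the target audio.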
