Using model.predict and interpreting results

Hi all,

I am learning TF and have created a model to classify data (values coming from sensors); my targets are types of events. It has 6 inputs and 5 outputs.
As my targets are 5 categories, I have used one-hot encoding, so I ended up with 5 possible values (0-4).
I have trained and saved my model. So far so good.
Now I wanted to try to make a prediction with it (that’s usually where the available training material falls short in my view, as it rarely gets to actually using the model you created!).

So I created an array of values mimicking my sensor data. I scaled it the same way I did with my training data (using sklearn preprocessing.scale).
Now when I run model.predict(example_data), I get:
array([[2.0519667e-05, 3.8042336e-03, 9.1781759e-01, 1.2774050e-04,
7.8229807e-02]], dtype=float32)
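
A side note on the scaling step: if preprocessing.scale was applied to the single new row on its own (an assumption, since that code is not shown), each column holds only one value, so every feature would be standardized to 0. A minimal sketch of the usual alternative, reusing the mean and standard deviation of the training data, is below. The arrays are made-up stand-ins, not the real sensor data; sklearn's StandardScaler (fit on the training inputs, transform on new data) follows the same idea.

```python
import numpy as np

# Hypothetical stand-in for the training inputs: 200 samples, 6 sensor values each.
rng = np.random.default_rng(0)
train_inputs = rng.normal(loc=10.0, scale=3.0, size=(200, 6))

# The scaling statistics come from the training data only.
train_mean = train_inputs.mean(axis=0)
train_std = train_inputs.std(axis=0)

# One new (made-up) sensor reading, shaped (1, 6) like a single-row batch.
example_raw = np.array([[9.0, 12.5, 7.3, 10.1, 14.2, 8.8]])

# Standardizing the single row by itself: each column equals its own mean,
# so every feature collapses to 0 and the information is lost.
col_std = example_raw.std(axis=0)
col_std[col_std == 0] = 1.0  # avoid dividing by zero
scaled_alone = (example_raw - example_raw.mean(axis=0)) / col_std

# Standardizing with the training statistics keeps the information.
example_data = (example_raw - train_mean) / train_std

print(scaled_alone)
print(example_data)
```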

So I guess each array value represents the probability of being one of my target categories 0-4. Is that correct? How do I interpret the result back to the target categories 0, 1, 2, 3 or 4?

Many thanks

Hi @Lars, the nature of your model output depends on your model, more specifically on the last layer of your neural network and its activation function.
Please share minimal reproducible code.
Thank you.

Thanks @tagoma

See below. Again this is based on a training course model I have adapted slightly to fit my data.

import tensorflow as tf

# Set the input and output sizes
input_size = 6 #6 data points as inputs
output_size = 5 #5 possible outputs/categories of events
# Use same hidden layer size for both hidden layers. Not a necessity.
hidden_layer_size = 50
# define how the model will look like
model = tf.keras.Sequential([
    # tf.keras.layers.Dense is basically implementing: output = activation(dot(input, weight) + bias)
    # it takes several arguments, but the most important ones for us are the hidden_layer_size and the activation function
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    # the final layer is no different, we just make sure to activate it with softmax
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])

### Choose the optimizer and the loss function

# we define the optimizer we'd like to use, 
# the loss function, 
# and the metrics we are interested in obtaining at each iteration
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
#model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

### Training
# That's where we train the model we have built.

# set the batch size
batch_size = 100

# set a maximum number of training epochs
max_epochs = 100

early_stopping = tf.keras.callbacks.EarlyStopping(patience=2) # early stopping mechanism

# fit the model
# note that this time the train, validation and test data are not iterable
model.fit(train_inputs, # train inputs
          train_targets, # train targets
          batch_size=batch_size, # batch size
          epochs=max_epochs, # epochs that we will train for (assuming early stopping doesn't kick in)
          callbacks=[early_stopping],
          validation_data=(validation_inputs, validation_targets), # validation data
          verbose=2 # making sure we get enough information about the training process
          )
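
One thing worth checking in the compile call above: the sparse_categorical_crossentropy loss expects the targets as integer class indices (0-4 here), whereas categorical_crossentropy is the variant that expects one-hot rows. A small NumPy sketch of the two representations and how to move between them follows; the target values are made up for illustration.

```python
import numpy as np

# Hypothetical integer targets for four events, each in the range 0-4.
integer_targets = np.array([2, 0, 4, 1])
num_classes = 5

# One-hot version of the same targets: row i has a 1 in column integer_targets[i].
one_hot_targets = np.eye(num_classes, dtype=np.float32)[integer_targets]

# Going back from one-hot rows to integer categories.
recovered = np.argmax(one_hot_targets, axis=1)

print(one_hot_targets.shape)  # (4, 5)
print(recovered)              # [2 0 4 1]
```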

The output layer of your classification model:

tf.keras.layers.Dense(output_size, activation='softmax')

It will then output values between 0.0 and 1.0 that you can interpret as probabilities. You will also notice that the probabilities of your different classes add up to 1.0 (the five values in your output do, up to rounding).
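
To make this concrete, here is a quick check using the five values posted in the question. It sums the row and takes the index of the largest probability as the predicted category. The event_names list is a hypothetical placeholder; the real names depend on how the integer targets 0-4 were assigned.

```python
import numpy as np

# Prediction copied from the first post (one sample, five classes).
predictions = np.array([[2.0519667e-05, 3.8042336e-03, 9.1781759e-01,
                         1.2774050e-04, 7.8229807e-02]], dtype=np.float32)

# Each row is a probability distribution over the classes.
row_sums = predictions.sum(axis=1)

# The predicted category is the index of the largest probability in each row.
predicted_classes = np.argmax(predictions, axis=1)
confidence = predictions.max(axis=1)

# Hypothetical names, listed in the same order as the integer targets 0-4.
event_names = ["event_0", "event_1", "event_2", "event_3", "event_4"]
predicted_names = [event_names[i] for i in predicted_classes]

print(row_sums)
print(predicted_classes)  # [2]
print(predicted_names)    # ['event_2']
print(confidence)
```

So for this sample the model assigns about 92% probability to category 2.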

Please read through the excellent Machine Learning course by Google, here.

Thanks. Yes, that’s my understanding of the softmax function, so I should get something like [0.1, 0.1, 0.1, 0.6, 0.1].
I tried it with the example in the course (which was based on an audio book data set and has 2 output values) and I get similar types of values:

predictions = model.predict(example_data)
array([[2.294427e-10, 1.000000e+00]], dtype=float32)

So I am wondering if it’s related to the representation of the output values, i.e. does it need to be converted to a different format?

I’m not sure I understand what you mean. But I would expect the call to the predict method to return, for each input sample, a list whose length is the total number of classes (5 in your example, based on your earlier comments).
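
On the format question: values such as 2.294427e-10 are ordinary floats shown in scientific notation (here roughly 0.0000000002), so nothing needs converting. A small sketch, using the two-class output quoted above, that checks the array shape and prints the same numbers as plain decimals. The choice of four decimal places is arbitrary.

```python
import numpy as np

# The two-class prediction quoted above: one row per sample, one column per class.
predictions = np.array([[2.294427e-10, 1.000000e+00]], dtype=np.float32)
print(predictions.shape)  # (1, 2)

# Same numbers, printed as plain decimals instead of scientific notation.
with np.printoptions(precision=4, suppress=True):
    print(predictions)

# Rounded copy, if plain Python floats are easier to read or log.
rounded = [round(float(p), 4) for p in predictions[0]]
print(rounded)  # [0.0, 1.0]
```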