How do I get predicted probabilities from Keras?

I’m pretty new to ML.
For a school project, I'm trying to optimize the TPR (true positive rate) of my confusion matrix, but I'm using a Sequential model from Keras, which doesn't have scikit-learn's predict_proba. How can I get the probabilities so that I can modify the threshold?

def perceptron(X,y):

  X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.2, random_state=42)

  model = km.Sequential()
  model.add(kl.Dense(30, activation='sigmoid' , input_dim=X.shape[1]))
  model.add(kl.Dense(30, activation='sigmoid'))
  model.add(kl.Dense(8, activation='sigmoid'))
  model.add(kl.Dense(5, activation='sigmoid'))
  model.add(kl.Dense(1, activation='sigmoid'))
  model.compile(optimizer='Adam', loss='mae')

  history = model.fit(X_train, y_train,
          verbose=0) # % of data being used for val_loss evaluation

  ev = model.evaluate(X_test, y_test)
  return y_test,X_test,history,model
def classification(y_test,X_test,history,model,aff):

    y_pred = model.predict(X_test)

    for i in range(len(y_pred)):
      if y_pred[i]<0.5:
        y_pred[i] = 0
      if y_pred[i]>0.5:
        y_pred[i] = 1

    threshold = 0.5

    probabilities = tf.nn.softmax(X_test).numpy()
    pred = np.argmax(probabilities, axis=1)


    class_report = classification_report(y_test, y_pred) # Additional evaluation metrics

    # Calculate confusion matrix
    conf_matrix = confusion_matrix(y_test, y_pred,labels=[0,1])

What can I do instead of y_pred = model.predict(X_test)?


Your current code applies sigmoid activation to every dense layer. (Maybe you should use relu activation for all but the last layer?) The last layer outputs a single value between 0 and 1, which you can treat as the probability of the positive class in binary classification, so model.predict already returns a probability for each input. I’m not sure what you’re trying to do with tf.nn.softmax, or whether it’s necessary.
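As a rough sketch of how those probabilities could be turned into labels with a custom threshold: the array `y_prob` below is a made-up stand-in for what `model.predict(X_test)` returns for a `Dense(1, activation='sigmoid')` output layer (shape `(n_samples, 1)`), and `y_true`, `predict_labels` and `true_positive_rate` are hypothetical names used only for illustration.

```python
import numpy as np

# Stand-in for `model.predict(X_test)`; these values are invented for the example.
y_prob = np.array([[0.10], [0.35], [0.45], [0.62], [0.80], [0.55]])
y_true = np.array([0, 1, 1, 1, 1, 0])  # hypothetical ground-truth labels


def predict_labels(probabilities, threshold=0.5):
    """Turn sigmoid outputs into 0/1 labels using a custom threshold."""
    return (probabilities.ravel() >= threshold).astype(int)


def true_positive_rate(y_true, y_pred):
    """TPR = TP / (TP + FN), i.e. recall on the positive class."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn)


# Lowering the threshold labels more samples as positive, which raises the TPR
# (usually at the cost of more false positives).
for threshold in (0.5, 0.3):
    y_pred = predict_labels(y_prob, threshold)
    print(threshold, y_pred, true_positive_rate(y_true, y_pred))
```

The resulting 0/1 array can be passed to classification_report or confusion_matrix in place of the raw model output.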

Thank you for your reply,
Why choose relu rather than sigmoid?

(Don’t pay attention to the tf.nn.softmax line, I forgot to comment it out :sweat_smile:)

Btw, I had forgotten that predict already gives the probability for a binary classification, thanks for reminding me!

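The convention is to use relu activation for the hidden layers (those between the input layer and the final layer) and sigmoid in the final layer to output a probability. Networks usually train better with relu in the hidden layers because its gradient is 1 for positive inputs, whereas the sigmoid gradient is at most 0.25 and saturates toward zero, so stacking several sigmoid layers tends to make the gradients vanish.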
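To illustrate the relu recommendation, here is a simplified numerical sketch. It only multiplies the activation derivatives across four hidden layers (the number in the question's model), ignoring the weights and assuming the same pre-activation `z` at every layer, so it is a toy comparison rather than a real backpropagation. The values in `z` are invented.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25 at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of relu: 1 for positive inputs, 0 otherwise."""
    return (z > 0).astype(float)


n_hidden_layers = 4              # hidden layers in the question's model
z = np.array([0.5, 2.0, 5.0])    # hypothetical positive pre-activations

# Factor by which the activations alone scale the gradient after 4 layers.
sigmoid_factor = sigmoid_grad(z) ** n_hidden_layers
relu_factor = relu_grad(z) ** n_hidden_layers

print("sigmoid:", sigmoid_factor)  # shrinks quickly, never above 0.25 ** 4
print("relu:   ", relu_factor)     # stays at 1 for positive inputs
```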