I am having trouble understanding the behaviour of the BinaryCrossentropy loss function in Keras when it is used by the evaluate method in certain situations.

Here’s my minimal reproducer:

from tensorflow import keras
import numpy as np
loss_func = keras.losses.BinaryCrossentropy()
nn = keras.Sequential([
    keras.layers.Dense(2**8, input_shape=(1,), activation='relu'),
    keras.layers.Dense(2, activation='softmax')
])
nn.compile(loss=loss_func, optimizer='adam')
train_x = np.array([0.4]) # this is an arbitrary input
train_y = np.array([[1.,0.]])
train_q = nn.predict(train_x)
print("train_q = ",train_q)
print("Evaluated loss = ",nn.evaluate(train_x,train_y))
print("Function loss = ",loss_func(train_y,train_q).numpy())
print("Manual loss = ", -np.log(train_q[0,0]) )

This yields the following output:

train_q = [[0.5108817 0.48911828]]
1/1 [==============================] - 0s 438ms/step - loss: 0.6823
Evaluated loss = 0.682330846786499
Function loss = 0.671617
Manual loss = 0.67161715

The function loss makes complete sense: it equals the loss value I calculate manually (by hand). What doesn’t make sense is the loss reported by the evaluate call. How did it get 0.68233 here?
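For reference, here is a minimal NumPy sketch of why the manual value -log(train_q[0,0]) agrees with the function loss. It assumes the usual definition of binary cross-entropy (element-wise terms averaged over the last axis) and ignores the small epsilon clipping Keras applies to predictions. The names y_true, y_pred, per_element and bce are mine, and the numbers are copied from the output above.

```python
import numpy as np

# Values copied from the printed output above.
y_true = np.array([[1.0, 0.0]])
y_pred = np.array([[0.5108817, 0.48911828]])

# Element-wise binary cross-entropy for each of the two outputs.
per_element = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

# Average over the last axis (the two output units).
bce = per_element.mean(axis=-1)

print("per-element terms =", per_element)
print("mean over outputs =", bce)
print("-log(q[0, 0])     =", -np.log(y_pred[0, 0]))
```

Because the softmax outputs sum to 1, the second term -log(1 - q[0,1]) is the same as -log(q[0,0]), so the mean of the two terms equals the manual loss.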

Hi, thanks for the suggestion. I just tried it, but both of those alternatives yield the same output as nn.predict, so it still seems like nn.evaluate is doing something wrong here. It’s as if there’s a bug in nn.evaluate?

# Seed value
# Apparently you may use different seed values at each stage
seed_value= 0
# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)
# 2. Set the `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)
# 3. Set the `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)
# 4. Set the `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value)
from tensorflow import keras
loss_func = keras.losses.BinaryCrossentropy()
nn = keras.Sequential([
    keras.layers.Dense(2**8, input_shape=(1,), activation='relu',
                       kernel_initializer=keras.initializers.GlorotUniform(seed=1)),
    keras.layers.Dense(2, activation='softmax',
                       kernel_initializer=keras.initializers.GlorotUniform(seed=1))
])
nn.compile(loss=loss_func, optimizer='adam')
train_x = np.array([0.4]) # this is an arbitrary input
train_y = np.array([[1.,0.]])
train_q = nn.predict_step(train_x)
print("train_q = ",train_q)
print("Evaluated loss = ",nn.evaluate(train_x,train_y))
print("Function loss = ",loss_func(train_y,train_q).numpy())
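One way to narrow this down might be to ask which predicted probability the evaluated loss from the first run would correspond to. The sketch below is only a diagnostic under an assumption: that evaluate applies the same formula as loss_func, so that for this one-hot target the loss reduces to -log(p), with p the probability of the true class. The variable names are mine, and the two loss values are copied from the first output.

```python
import numpy as np

evaluated_loss = 0.682330846786499  # printed by nn.evaluate in the first run
function_loss = 0.671617            # printed by loss_func in the first run

# Assumption: loss = -log(p), so p = exp(-loss).
p_from_evaluate = np.exp(-evaluated_loss)
p_from_function = np.exp(-function_loss)

print("p implied by evaluated loss =", p_from_evaluate)
print("p implied by function loss  =", p_from_function)
```

If the assumption holds, the evaluated loss implies a probability of roughly 0.505 rather than the 0.511 in train_q. That would suggest evaluate scored a different prediction than the one returned by predict, rather than using a different loss formula, but this is a hypothesis to check, not a confirmed cause.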