How is BinaryCrossentropy loss calculated in keras fit/evaluate?

Hello,

I am having trouble understanding the behaviour of the BinaryCrossentropy loss function in Keras's evaluate method in certain situations …

Here’s my minimal reproducer:

from tensorflow import keras
import numpy as np
loss_func = keras.losses.BinaryCrossentropy()
nn = keras.Sequential([
  keras.layers.Dense(2**8, input_shape=(1,), activation='relu'),
  keras.layers.Dense(2,activation='softmax')
])
nn.compile(loss=loss_func,optimizer='adam')
train_x = np.array([0.4]) # this is an arbitrary input
train_y = np.array([[1.,0.]])
train_q = nn.predict(train_x)
print("train_q = ",train_q)
print("Evaluated loss = ",nn.evaluate(train_x,train_y))
print("Function loss = ",loss_func(train_y,train_q).numpy())
print("Manual loss = ", -np.log(train_q[0,0]) )

Yielding the following output:

train_q =  [[0.5108817  0.48911828]]
1/1 [==============================] - 0s 438ms/step - loss: 0.6823
Evaluated loss =  0.682330846786499
Function loss =  0.671617
Manual loss =  0.67161715

The function loss makes complete sense: it equals the loss value I calculate ‘manually’ (by hand). What doesn’t make sense is the loss calculated in the evaluate call. How did it get 0.68233 here?
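For anyone following along, `BinaryCrossentropy` averages the per-class terms over the last axis, and with a two-way softmax the two outputs sum to one, so both terms collapse to `-log(q[0,0])` — which is why the “manual” shortcut matches the function loss. A numpy-only sketch of that reduction, using the predicted values printed above:

```python
import numpy as np

# Predicted probabilities and one-hot label from the reproducer above.
q = np.array([[0.5108817, 0.48911828]])
y = np.array([[1.0, 0.0]])

# Binary cross-entropy, averaged over the class axis:
# mean_j [ -y_j*log(q_j) - (1 - y_j)*log(1 - q_j) ]
bce = np.mean(-y * np.log(q) - (1.0 - y) * np.log(1.0 - q), axis=-1)

# Because the two probabilities sum to 1, both per-class terms
# equal -log(q[0, 0]), so the mean does too.
print(bce[0])            # the "function loss"
print(-np.log(q[0, 0]))  # the "manual loss"
```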

Many thanks!
Will

Bumping this simple question, as it’s still unanswered!

Have you tried nn(train_x) or nn.predict_step(train_x) instead of nn.predict(train_x)?

Hi, thanks for the suggestion. I just tried it, but both of those alternatives yield the same output as nn.predict, so it still seems like nn.evaluate is doing something wrong here. It’s like there’s a bug in nn.evaluate?

If you suspect a bug, filing this in the TensorFlow issue tracker might make sense.
Which library versions are you using, by the way?

I did open an issue months ago (“unexpected value of binary_crossentropy loss function in network with” · Issue #56910 · tensorflow/tensorflow · GitHub), but at the time the person who responded wasn’t very helpful and basically told me to post in the keras repo instead …

You can copy-paste my example into Google Colab and it will reproduce the issue, so that’s TF 2.8.2.

I can reproduce the same results:


# Seed value
# Apparently you may use different seed values at each stage
seed_value= 0

# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set the `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set the `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set the `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
from tensorflow import keras
tf.random.set_seed(seed_value)

loss_func = keras.losses.BinaryCrossentropy()
nn = keras.Sequential([
  keras.layers.Dense(2**8, input_shape=(1,), activation='relu', kernel_initializer=keras.initializers.GlorotUniform(seed=1)),
  keras.layers.Dense(2,activation='softmax', kernel_initializer=keras.initializers.GlorotUniform(seed=1))
])
nn.compile(loss=loss_func,optimizer='adam')
train_x = np.array([0.4]) # this is an arbitrary input
train_y = np.array([[1.,0.]])
train_q = nn.predict_step(train_x)
print("train_q = ",train_q)
print("Evaluated loss = ",nn.evaluate(train_x,train_y))
print("Function loss = ",loss_func(train_y,train_q).numpy())
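One possible explanation worth investigating (this is an assumption about the compiled-loss path, not something verified against the Keras source here): Keras activations can cache their pre-activation inputs on the output tensor, and the loss wrapper used inside evaluate may then compute the cross-entropy from those logits via element-wise sigmoids, whereas calling loss_func on a plain numpy array has no cached logits and works from the probabilities. Treating *softmax* logits as if they were independent *sigmoid* logits gives a different number, of roughly the size of the gap seen above. A numpy-only sketch of that arithmetic, with made-up logits (the real values depend on the network weights):

```python
import numpy as np

# Hypothetical pre-softmax logits for one sample, chosen only so the
# softmax output lands near [0.51, 0.49] as in the reproducer.
z = np.array([0.02, -0.02])
y = np.array([1.0, 0.0])

# Probabilities the softmax layer would output.
q = np.exp(z) / np.sum(np.exp(z))

# (a) BCE computed from probabilities, as loss_func(y, q) does
#     when given a plain numpy array.
bce_from_probs = np.mean(-y * np.log(q) - (1 - y) * np.log(1 - q))

# (b) BCE computed from the logits via element-wise sigmoids,
#     i.e. a sigmoid-cross-entropy-from-logits style calculation.
sig = 1.0 / (1.0 + np.exp(-z))
bce_from_logits = np.mean(-y * np.log(sig) - (1 - y) * np.log(1 - sig))

print(bce_from_probs, bce_from_logits)  # the two values differ
```

If this is what is happening, the fix on the user side would be either a single sigmoid output unit with BinaryCrossentropy, or keeping the two-way softmax and switching to CategoricalCrossentropy.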