Custom loss function error: InvalidArgumentError: The two arguments to a cwise op must have same number of elements

Hello!

I am trying to write my own custom loss function, but I can't get it to work. I describe the problem step by step below and also provide the full code as a Colab notebook.

I have a toy model that takes (and returns) arrays of two values, like this:

  import tensorflow as tf
  from tensorflow.keras.layers import Input, Dense
  from tensorflow.keras.models import Model
  inp = Input(shape=(2,))  # shape is a tuple: two values per sample
  out = Dense(2, activation='sigmoid')(inp)
  model = Model(inp, out)

My custom loss function is the one below; it returns an array with one loss per sample in the batch, i.e. its length equals the batch size.

For instance if the batch size is 3, the loss function returns an array like:

[0.53182095 1.104414 0.9355842]

import numpy as np

def ext_function_FAILS(y_true, y_pred):
  a = y_true.numpy()         # pull the tensors out as NumPy arrays
  b = y_pred.numpy()         # (b is unused below)
  loss = np.sum(a, axis=1)   # one loss value per sample
  print(loss)
  return loss
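
For reference, the failing function behaves as expected when exercised on its own with eager tensors (a quick sketch with made-up demo values, reusing the definition above):

y_true_demo = tf.constant([[0.1, 0.4], [0.5, 0.6], [0.2, 0.7]], dtype=tf.float32)
y_pred_demo = tf.constant([[0.2, 0.3], [0.4, 0.7], [0.3, 0.6]], dtype=tf.float32)
print(ext_function_FAILS(y_true_demo, y_pred_demo))  # -> array of shape (3,), e.g. [0.5 1.1 0.9]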

Next, I compile the model after wrapping the loss function with tf.py_function inside a small callable class:

class LossFunction(object):
  def __init__(self, model, loss_function):
    self.model = model
    self.loss_function = loss_function
  def __call__(self, y_true, y_pred, sample_weight=None):
    # run the plain-Python loss inside the graph via tf.py_function
    error = tf.py_function(self.loss_function, [y_true, y_pred], Tout=tf.float32)
    print('Loss after loss_function')
    print(error)
    return error
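
Called directly, outside of fit, the wrapper also seems fine (a sketch with made-up tensors; note that the model argument is never used inside __call__):

y_true_demo = tf.constant([[0.1, 0.4], [0.5, 0.6]], dtype=tf.float32)
y_pred_demo = tf.constant([[0.2, 0.3], [0.4, 0.7]], dtype=tf.float32)
wrapped = LossFunction(None, ext_function_FAILS)
print(wrapped(y_true_demo, y_pred_demo))  # -> tf.Tensor of shape (2,), dtype=float32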

def make_model():
  inp = Input(shape=(2,))
  out = Dense(2, activation='sigmoid')(inp)
  model = Model(inp, out)
  return model

tf.compat.v1.enable_eager_execution()  # eager is already the default in TF 2.x
model = make_model()

# this is the actual data: three arrays of two elements each
np.random.seed(1)
N_SAMPLES=3
N_ELEMENTS_PER_SAMPLE=2
x = np.random.rand(N_SAMPLES,N_ELEMENTS_PER_SAMPLE)
y = x

model.compile(optimizer='adam', loss=LossFunction(model, ext_function_FAILS), run_eagerly=True)

Finally, I call fit on the model, which raises the error below:

model.fit(x, y, epochs=1, verbose=True, batch_size=3)

InvalidArgumentError: The two arguments to a cwise op must have same number of elements, got 6 and 1 [Op:SigmoidGrad]

I understand the '6' in the error somehow comes from 3 samples × 2 elements. However, the call to tf.py_function returns a tensor containing three values, which I believe is the expected behaviour for a batch of three samples:

tf.Tensor([0.23909448 1.1373465 0.30244696], shape=(3,), dtype=float32)
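
To double-check where the counts come from, the element counts can be inspected directly (a sketch reusing x, y, and model from above):

y_pred_check = model(x)                # forward pass on the (3, 2) input
print(tf.size(y_pred_check).numpy())   # 6 -> the '6' in the error (3 samples x 2 elements)
loss_check = LossFunction(model, ext_function_FAILS)(tf.constant(y, dtype=tf.float32), y_pred_check)
print(loss_check.shape)                # (3,)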

Now, if I compile the model with this other loss function, which uses only TensorFlow ops and never drops down to NumPy arrays, it does work:

def ext_function_WORKS(y_true, y_pred):
  squared_difference = tf.square(y_true - y_pred)
  loss = tf.reduce_mean(squared_difference, axis=-1)  # one loss per sample
  print(loss)
  return loss

model.compile(optimizer='adam', loss=LossFunction(model, ext_function_WORKS), run_eagerly=True)

model.fit(x, y, epochs=1, verbose=True, batch_size=3)

tf.Tensor([0.17222881 0.04165572 0.13933308], shape=(3,), dtype=float32)
1/1 [==============================] - 0s 73ms/step - loss: 0.1177
<keras.callbacks.History at 0x7f0a27e60c50>

As you can see, both loss functions return the same type of tensor, differing only in the actual loss values:

tf.Tensor([0.23909448 1.1373465 0.30244696], shape=(3,), dtype=float32)
tf.Tensor([0.17222881 0.04165572 0.13933308], shape=(3,), dtype=float32)
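
Comparing the two return values side by side confirms this (a sketch reusing x, y, and model from above):

y_true_t = tf.constant(y, dtype=tf.float32)
y_pred_t = model(x)
fails = tf.py_function(ext_function_FAILS, [y_true_t, y_pred_t], Tout=tf.float32)
works = ext_function_WORKS(y_true_t, y_pred_t)
print(fails.shape, fails.dtype)  # (3,) <dtype: 'float32'>
print(works.shape, works.dtype)  # (3,) <dtype: 'float32'>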

I have been unable to work out why my loss function fails. Any help would be much appreciated.

Thanks a lot!
Jorge

I also get this error when using NumPy.
Did you find a solution?