Custom loss function - "Shapes of all inputs must match"

glestrade · September 19, 2021, 10:22pm

Hi y’all… continuing the saga from my previous post. I’ve included code snippets for your viewing at the bottom. Let me know if you need more!

Currently, I’m trying to compute gradients with a GradientTape using just some integers I obtained from a custom loss function. It seems like it’s trying to find the gradient for multiple variables at once, because I had to make the GradientTape persistent to avoid the following error:

RuntimeError: A non-persistent GradientTape can only be used to compute one set of gradients (or jacobians)

This workaround means I have to manually delete the GradientTape later, which of course I’m not the biggest fan of…
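
For reference, here is a minimal sketch of the behavior behind that error, assuming TensorFlow 2.x in eager mode. The variables `w` and `b` and the toy loss are hypothetical and are not taken from the gists below. A non-persistent tape allows exactly one `gradient` call, but that one call can take a list of sources, so asking for gradients of several variables does not by itself require `persistent=True`:

```python
import tensorflow as tf

# Hypothetical stand-ins for model weights (not from the original code).
w = tf.Variable(2.0)
b = tf.Variable(1.0)

with tf.GradientTape() as tape:  # non-persistent by default
    loss = (w * 3.0 + b) ** 2

# A single gradient() call can return gradients for several sources at once.
grad_w, grad_b = tape.gradient(loss, [w, b])

# A second gradient() call on the same non-persistent tape raises RuntimeError.
try:
    tape.gradient(loss, w)
    second_call_failed = False
except RuntimeError:
    second_call_failed = True
```

If the error shows up, it may be worth checking whether `tape.gradient` is being called more than once per tape, for example once per variable inside a loop.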

Thanks for reading and take care!

=======================================================

Some output:

model_2: print loss_value_tensor in get grad f’n(48,)

model_2: print x shape in compute loss f’n(48, 28, 28, 1)

model_2: print y shape in compute loss f’n(48,)

model_2: print loss_value shape in compute loss f’n()

… And that will make sense when you see the code.

Error was: Shapes of all inputs must match: values[0].shape = [3,3,1,8] != values[1].shape = [8] [Op:Pack] name: initial_value
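
One way a message of this form can arise (an assumption, since the full code is truncated in this post) is passing a Python list of differently shaped tensors, such as a convolution kernel and its bias, to `tf.Variable`. `tf.Variable` converts its argument to a single tensor under the name `initial_value`, and a list of tensors is stacked with the Pack op, which requires identical shapes. A minimal sketch, using placeholder tensors with the two shapes from the error:

```python
import tensorflow as tf

# Placeholder tensors with the shapes reported in the error message.
kernel = tf.zeros([3, 3, 1, 8])
bias = tf.zeros([8])

# Wrapping the whole list in one tf.Variable tries to pack both tensors
# into a single tensor, which fails because the shapes differ.
try:
    tf.Variable([kernel, bias])
    error_message = ""
except (tf.errors.InvalidArgumentError, ValueError) as err:
    error_message = str(err)

# Keeping one tf.Variable per tensor avoids the packing step entirely.
per_tensor = [tf.Variable(t) for t in (kernel, bias)]
```

If something like `model.trainable_variables` or a list of gradients is being wrapped in a single `tf.Variable` or `tf.stack` call, handling each entry separately, as in the last line, is one possible way around it.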

=======================================================

Here are the code snippets I think you’d need to diagnose the problem. The matrices could be filled with dummy data, since we’re only concerned with the NumPy array shapes and the tensor shapes:

https://gist.github.com/erick016/898ebb309ec9cfaae314aaef8ff0e385

loss_fns

#Training replacement

d = 0 #from algorithm 2, used in computing loss
#alpha = .33 #learning rate

#def nt_grad(curr_d):
    #return -1*alpha*(curr_d/BATCH_SIZE)

def compute_loss(model, x, y, training):
    out = model(x, training=training)

This file has been truncated; the full source is in the gist linked above.

https://gist.github.com/erick016/fbf91402b5b743a9d3f1467d081c165b

bigL.py

    print("Reached Big L.")
    
    for batch_idx in range(NUM_BATCHES):
        #print("epoch_overall_loss_per_batch Index:" + str(epoch * NUM_BATCHES + batch_idx))
        print("batch_idx:" + str(batch_idx))
        #print("train_loss_per_batch_L1 Shape:" + str(np.shape(train_loss_per_batch_L1)))
        #print("train_loss_per_batch_L2 Shape:" + str(np.shape(train_loss_per_batch_L2)))
        #print("train_loss_per_batch_L3 Shape:" + str(np.shape(train_loss_per_batch_L3)))
        #print("epoch_overall_loss_per_batch Shape:" + str(np.shape(train_loss_per_batch_L3)))
        #print("===================================")

This file has been truncated; the full source is in the gist linked above.

glestrade · September 20, 2021, 12:15am

Relevant article I’m looking at:

The actual custom optimizer I’m using:

https://gist.github.com/erick016/30567f54946cf9e2804db2ab10da5dd4

custom_optimizer.py

#!/usr/bin/env python
# coding: utf-8

# In[1]:


#momentum "m_hat" and gradient "g_hat"


# In[2]:

This file has been truncated; the full source is in the gist linked above.