How does matmul work?

Sasha_Ama · August 29, 2022, 3:19am

Hello. I am new to TensorFlow and decided to explore how matmul() works. I started with something very simple:
import tensorflow as tf
q = tf.Variable([-1,-3])
r = tf.Variable([4,-2])
print(f"q={q}")
print(f"r={r}")
res = tf.matmul(q, r)
print(f"q X r = {res}")
According to what I remember from college, the result should be [[-4,2],[-12,6]], but I got an exception:
tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] and In[1] ndims must be == 2: 1 [Op:MatMul]
Could you please explain what I am doing wrong?

Robert_Pope · August 29, 2022, 8:05pm

Both of those variables have the same shape. As written, each is a 1-D tensor of shape (2,), not a matrix, which is what the error message is complaining about: tf.matmul needs both inputs to have at least 2 dimensions. To make q a matrix with 1 row and 2 columns, it needs to be [[-1,-3]].

Even then, you would be trying to do:
Matrix dimension 1x2 times Matrix dimension 1x2

For valid matrix multiplication, the inner dimensions (the two closest to each other) have to match. But you would have 2 columns in q trying to line up with 1 row in r.

The dimensions furthest apart (the outer ones) give the dimensions of the matmul result. So, for your example calculation, q should be 2 rows and 1 column, not 1 row and 2 columns:
2x1 matmul 1x2 => 2x2 matrix
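The shape rule above can be sketched in plain Python, without TensorFlow, to make the bookkeeping explicit. This is only an illustration: the helper names `matmul_shape` and `matmul` are made up for this example and are not part of any library.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise if it is invalid."""
    if len(a_shape) != 2 or len(b_shape) != 2:
        raise ValueError(f"both inputs must be 2-D, got {a_shape} and {b_shape}")
    (a_rows, a_cols), (b_rows, b_cols) = a_shape, b_shape
    # Inner dimensions (columns of a, rows of b) must match.
    if a_cols != b_rows:
        raise ValueError(f"inner dimensions differ: {a_cols} vs {b_rows}")
    # Outer dimensions give the shape of the result.
    return (a_rows, b_cols)


def matmul(a, b):
    """Multiply two matrices given as nested lists (hypothetical helper)."""
    rows, cols = matmul_shape((len(a), len(a[0])), (len(b), len(b[0])))
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


q = [[-1], [-3]]  # 2 rows, 1 column
r = [[4, -2]]     # 1 row, 2 columns

print(matmul_shape((2, 1), (1, 2)))  # (2, 2)
print(matmul(q, r))                  # [[-4, 2], [-12, 6]]
```

Calling `matmul_shape((1, 2), (1, 2))` raises a `ValueError`, which mirrors the 1x2 times 1x2 case described above.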

Sasha_Ama · August 29, 2022, 10:04pm

That is great! I had also suspected that q and r needed different dimensions, but could not find the proper way to specify that. After your message I did more experiments and found out what should be done:
q = tf.Variable([[-1],[-3]])
r = tf.Variable([[4,-2]])
which prints:
q X r = [[ -4   2]
 [-12   6]]
exactly what I’d expected.
Thanks a lot!
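For anyone who already has 1-D tensors and does not want to retype the literals with extra brackets, here is a minimal sketch of three ways to get the same 2x2 result. It assumes TensorFlow 2.x in eager mode; the variable names are only illustrative.

```python
import tensorflow as tf

q = tf.constant([-1, -3])  # shape (2,), 1-D
r = tf.constant([4, -2])   # shape (2,), 1-D

# Option 1: reshape into a column (2x1) and a row (1x2), then matmul.
col = tf.reshape(q, [2, 1])
row = tf.reshape(r, [1, 2])
res_reshape = tf.matmul(col, row)

# Option 2: add the missing axis with expand_dims instead of reshape.
res_expand = tf.matmul(tf.expand_dims(q, axis=1), tf.expand_dims(r, axis=0))

# Option 3: compute the outer product directly on the 1-D inputs.
res_outer = tf.tensordot(q, r, axes=0)

print(res_reshape.numpy().tolist())  # [[-4, 2], [-12, 6]]
```

All three produce a tensor of shape (2, 2). Which one reads best is mostly a matter of taste; `tf.tensordot` with `axes=0` skips the intermediate 2-D tensors entirely.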