`key_dim` in multihead attention layer

Hey all,

I am looking at the documentation for the `MultiHeadAttention` layer, and I do not really understand the purpose of the `key_dim` parameter.

The documentation says:

> `key_dim`: Size of each attention head for query and key.
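For concreteness, here is a minimal sketch of what I tried (the shapes and values are just made up to probe the behavior):

```python
import numpy as np
import tensorflow as tf

# Toy inputs: batch of 2, sequence length 8, embedding dim 16 (arbitrary)
query = np.random.rand(2, 8, 16).astype("float32")
value = np.random.rand(2, 8, 16).astype("float32")

# key_dim=4 is deliberately smaller than the embedding dim of 16
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)

output = mha(query, value)
print(output.shape)  # (2, 8, 16) -- the output keeps the query's last dimension
```

This runs without complaint even though `key_dim=4` does not match the embedding dim of 16, so I am unsure what `key_dim` actually controls. Is it the size of the per-head projection of the query and key before the dot product?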

Thanks in advance :slight_smile: