Why is the input embedding (x) multiplied by sqrt(d_model) in the Decoder class of the Transformer?

Abhishek_Kishore · May 16, 2022, 2:29am

I am going through the code of the Transformer model [here].

I noticed that in the call method of the Decoder class, the input embedding is multiplied by the square root of d_model before anything else is done with it. No explanation is given for this step. Can someone please explain why this is done in the Decoder class?
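To make the step in question concrete, here is a minimal sketch in plain Python. It is not the linked code: the names `positional_encoding`, `decoder_input`, and `embedding_table`, the small uniform initialisation, and the sizes are all assumptions made for illustration. It only shows the pattern being asked about, where each token embedding is scaled by sqrt(d_model) and then added to a sinusoidal positional encoding.

```python
import math
import random


def positional_encoding(position, d_model):
    """Sinusoidal positional encoding for one position; every value lies in [-1, 1]."""
    return [
        math.sin(position / 10000 ** (j / d_model)) if j % 2 == 0
        else math.cos(position / 10000 ** ((j - 1) / d_model))
        for j in range(d_model)
    ]


def decoder_input(token_ids, embedding_table, d_model):
    """Look up each token embedding, scale it by sqrt(d_model), add the positional encoding."""
    scale = math.sqrt(d_model)  # the multiplication the question is about
    rows = []
    for pos, tok in enumerate(token_ids):
        emb = embedding_table[tok]
        pe = positional_encoding(pos, d_model)
        rows.append([e * scale + p for e, p in zip(emb, pe)])
    return rows


# Hypothetical sizes and initialisation, chosen only for the demo.
d_model = 16
vocab_size = 10
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-0.05, 0.05) for _ in range(d_model)]
    for _ in range(vocab_size)
]

x = decoder_input([1, 4, 7], embedding_table, d_model)
print(len(x), len(x[0]))
```

In this sketch the embedding values start out much smaller than the positional encoding values, so the scaling changes how much each term contributes to the sum. Whether the linked code relies on that effect is not something the sketch can confirm.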