NLP translator data preparation

I am new to this forum and unsure if anyone is willing to help, but I want to give it a try.

My goal is a translator from German to English and vice versa. I have a large dataset of 152,000 samples (words and sentences).

I want to use a sequence length of 60. My input vocabSize is 30944 and my output vocabSize is 14849.

If I tokenize and pad my sequences, I get something like [[11,0,44,2322,23111,…],[444,22,11113,4456,…]]. It would be easy to create an input tensor with tf.tensor2d(paddedInputs); the shape would be [152000, 60].
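To show what I mean, here is roughly how I build the padded 2D input (the `wordIndex` map and sample sentence are just made-up toy values for illustration, not my real data):

```javascript
// Toy word -> id vocabulary (my real one has 30944 entries).
const wordIndex = { hallo: 11, welt: 44 };

// Map a sentence to token ids, then truncate/pad to a fixed length.
function tokenizeAndPad(sentence, seqLen) {
  const ids = sentence
    .toLowerCase()
    .split(/\s+/)
    .map(w => wordIndex[w] || 0); // 0 = padding / unknown
  const padded = ids.slice(0, seqLen); // truncate if too long
  while (padded.length < seqLen) padded.push(0); // pad if too short
  return padded;
}

const paddedInputs = ["hallo welt"].map(s => tokenizeAndPad(s, 60));
console.log(paddedInputs[0].length); // 60
// const inputTensor = tf.tensor2d(paddedInputs); // shape [numSamples, 60]
```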

But I read that an NLP translator model needs a 3D tensor of shape [batchSize, sequenceLength, vocabSize].

Two questions:

  1. Is it really true that the tensor has to be 3D instead of 2D? Why?
  2. How do I create the 3D tensor from my 2D paddedInputs?
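In case it helps to show my current understanding: I am guessing (possibly wrongly) that the vocabSize dimension means each token id gets one-hot encoded over the vocabulary. With toy sizes it would look like the sketch below; at my real sizes ([152000, 60, 30944]) such an array would be enormous, which is partly why I am asking.

```javascript
// One-hot encode a list of token ids over a (toy) vocabulary.
function oneHot(ids, vocabSize) {
  return ids.map(id => {
    const row = new Array(vocabSize).fill(0);
    row[id] = 1;
    return row;
  });
}

const padded2d = [[1, 0, 2], [2, 2, 0]]; // shape [2, 3]
const tensor3d = padded2d.map(seq => oneHot(seq, 4)); // shape [2, 3, 4]
console.log(tensor3d[0][0]); // [0, 1, 0, 0]
```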