Hello y’all

I’m trying to create a text-summarization model by fine-tuning GPT-2, and here is my current code:

```
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2LMHeadModel.from_pretrained("gpt2")

documents = ["Hello, World", "Hello, World", "Hello, World"]
summaries = ["Good Morning", "Good Evening", "Good Day"]

# 1. tokenize each string in those two lists
# (no padding or truncation here, since as far as I know OpenAI didn't either)
documents_toknized = list(map(tokenizer, documents))
summaries_toknized = list(map(tokenizer, summaries))

# 2. convert those lists into TensorFlow tensors
documents_tensor = ### I'm stuck here
summaries_tensor = ### I'm stuck here

optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
model.compile(optimizer=optimizer, loss=model.compute_loss)
model.fit(x=documents_tensor, y=summaries_tensor, epochs=1)
```
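One thing I'm also unsure about: since GPT-2 is a plain causal language model, I suspect each document/summary pair may need to be joined into a single training string rather than fed as separate `x` and `y` (the GPT-2 paper used a `TL;DR:` prompt for summarization). This formatting step is just my guess, not something from the tutorial:

```python
documents = ["Hello, World", "Hello, World", "Hello, World"]
summaries = ["Good Morning", "Good Evening", "Good Day"]

# join each (document, summary) pair into one training string,
# separated by a "TL;DR:" prompt
training_texts = [
    doc + " TL;DR: " + summ for doc, summ in zip(documents, summaries)
]

print(training_texts[0])  # Hello, World TL;DR: Good Morning
```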

As written in this tutorial doc, I’m trying to convert `documents_toknized` and `summaries_toknized` to the TensorFlow tensor type for `x` and `y` in `model.fit()`, but I don’t know what exactly I should put into `x` and `y`.

I know I can convert each one of them like this:

```
input_ids = [item['input_ids'] for item in documents_toknized]
attention_mask = [item['attention_mask'] for item in documents_toknized]
input_ids_tensor = tf.convert_to_tensor(input_ids)
attention_mask_tensor = tf.convert_to_tensor(attention_mask)
```
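One issue I ran into: `tf.convert_to_tensor` only works when every `input_ids` list has the same length, so I think the sequences need padding first. Here's a plain-Python sketch of that padding step, reusing GPT-2's end-of-text id `50256` as the pad id since GPT-2 has no dedicated pad token (the other token ids below are just toy stand-ins):

```python
# toy token-id lists of unequal length (stand-ins for tokenizer output)
input_ids = [[101, 11, 2159], [101], [102, 6288]]

pad_id = 50256  # GPT-2's end-of-text id, often reused as the pad id
max_len = max(len(ids) for ids in input_ids)

# right-pad every sequence to max_len; the attention mask is 1 for
# real tokens and 0 for padding
padded = [ids + [pad_id] * (max_len - len(ids)) for ids in input_ids]
mask = [[1] * len(ids) + [0] * (max_len - len(ids)) for ids in input_ids]

print(padded)  # [[101, 11, 2159], [101, 50256, 50256], [102, 6288, 50256]]
print(mask)    # [[1, 1, 1], [1, 0, 0], [1, 1, 0]]
```

I believe the Hugging Face tokenizer can do this in one call with `tokenizer(documents, padding=True, return_tensors="tf")` (after setting `tokenizer.pad_token = tokenizer.eos_token`), but I wanted to see the shapes explicitly.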

But I’m wondering: should I put just each `input_ids_tensor` into `x` and `y`, or do it some other way?

Also, how can I apply batching here?
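My current (unverified) guess is that Keras handles this if I pass `batch_size=...` to `model.fit()`, or that I could build batches myself with `tf.data.Dataset.from_tensor_slices(...).batch(...)`. The slicing itself would behave like this plain-Python sketch:

```python
def batched(items, batch_size):
    # yield consecutive chunks of `items`, the last one possibly smaller
    # (like Dataset.batch with drop_remainder=False)
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

rows = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
print(list(batched(rows, 2)))
# [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10]]]
```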

Can anyone help me with this? Thanks