AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'to_tensor'

ipietri · October 15, 2021, 1:00pm

Hi everyone,

I’m fine-tuning a transformer model (BERT) with Keras/TensorFlow, following the procedure described here (Fine-tune a pretrained model).

The model was running without any problem until yesterday, when it started to throw this error: AttributeError: ‘tensorflow.python.framework.ops.EagerTensor’ object has no attribute ‘to_tensor’.

I’m still able to run the model in the local TensorFlow setup on my Apple M1 laptop, but I can’t run it on Google Colab, Amazon SageMaker, etc. The model was running just fine on Google Colab until yesterday, when it started throwing the error.

I wonder what could be happening. I’d really appreciate it if someone could shed some light on this.

Thanks,
Isabel

#help_request
#keras

lgusm · October 15, 2021, 1:11pm

Sorry for the issue.

I don’t know exactly what’s wrong, but to help debug it, it would be useful to find out which object is raising the error and which version of the package that defines it is installed. My guess is that a package was updated in the cloud environments (Colab and SageMaker) but not in your local environment. That would explain why it works locally and not in the cloud.
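
A minimal sketch of that version check, assuming the packages involved are tensorflow, transformers, and datasets (that list is my assumption; adjust it to whatever the failing code imports). Running it in both the local and the cloud environment and comparing the output should show whether a package was updated:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed package list. On Apple Silicon, TensorFlow may be installed under a
# different distribution name (for example tensorflow-macos), so a missing
# "tensorflow" entry there does not necessarily mean it is absent.
for name, ver in installed_versions(["tensorflow", "transformers", "datasets"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Printing type(obj) for the object that raises the error, in each environment, would also show whether the two setups are returning different tensor classes.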

ipietri · October 15, 2021, 2:03pm

Thanks. The code below reproduces the error.

!pip install transformers
!pip install datasets

import pandas as pd
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer
from datasets import Dataset

# dummy sentences
sentences = ['the house is blue and big', 'this is fun stuff','what a horrible thing to say']

# create a pandas dataframe and convert it to a Hugging Face dataset
df = pd.DataFrame({'Text': sentences})
dataset = Dataset.from_pandas(df)

# download the BERT tokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# tokenize each sentence in dataset
dataset_tok = dataset.map(lambda x: tokenizer(x['Text'], truncation=True, padding=True, max_length=10), batched=True)

# remove original text column and set format
dataset_tok = dataset_tok.remove_columns(['Text']).with_format('tensorflow')

# extract features
features = {x: dataset_tok[x].to_tensor() for x in tokenizer.model_input_names}

ipietri · October 15, 2021, 3:29pm

The code is working now. I removed to_tensor() and it runs fine. I see now that the call was redundant, although it is the procedure suggested in the official Hugging Face documentation (Fine-tune a pretrained model), and TensorFlow wasn’t throwing an error until now. Anyway, I’m glad it is running now.

This solution was suggested here: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'to_tensor' - Stack Overflow
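
For anyone who needs the same code to run in both environments, here is a minimal sketch of a defensive version of the fix. It relies on the fact that to_tensor() is a method of tf.RaggedTensor, while a dense EagerTensor does not have it (which is what the error message says). The helper name as_dense is hypothetical, and the idea that the column type changed from ragged to dense after a package update is a plausible explanation, not something confirmed in this thread:

```python
def as_dense(column):
    """Return a dense tensor whether or not the column is ragged.

    Ragged tensors (tf.RaggedTensor) expose to_tensor(); dense EagerTensors
    do not, so the method is only called when it is available.
    """
    to_tensor = getattr(column, "to_tensor", None)
    return to_tensor() if callable(to_tensor) else column


# Usage with the names from the reproduction code earlier in the thread
# (dataset_tok and tokenizer), left commented out so this block runs alone:
# features = {x: as_dense(dataset_tok[x]) for x in tokenizer.model_input_names}
```

If the columns are always dense in your environment, simply dropping the to_tensor() call, as described above, is enough.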