Masked Autoencoders are now available in 🤗 transformers in TensorFlow!

@ariG23498 and I contributed the MAE model [1] in TensorFlow to transformers.

Check it out here:

from transformers import AutoFeatureExtractor, TFViTMAEForPreTraining
from PIL import Image
import requests

# Download a sample image from the COCO validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/vit-mae-base")
model = TFViTMAEForPreTraining.from_pretrained("facebook/vit-mae-base")

# The TensorFlow model expects TensorFlow tensors, so use "tf" rather than "pt"
inputs = feature_extractor(images=image, return_tensors="tf")
outputs = model(**inputs)
loss = outputs.loss  # reconstruction loss
mask = outputs.mask  # 1 for masked patches, 0 for visible ones
ids_restore = outputs.ids_restore  # indices that restore the original patch order
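
For anyone wondering what `mask` and `ids_restore` represent, here is a minimal, dependency-free sketch of MAE-style random masking for a single image. It is an illustration only, not the code that lives in transformers: the helper name `random_masking`, the `seed` argument, and the use of plain Python lists are all assumptions made for this example. It assumes 196 patches (a 224x224 image with 16x16 patches) and a mask ratio of 0.75.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=0):
    """Illustrative MAE-style random masking for one image (hypothetical helper)."""
    rng = random.Random(seed)
    len_keep = int(num_patches * (1 - mask_ratio))

    # One random score per patch; patches with the smallest scores stay visible.
    noise = [rng.random() for _ in range(num_patches)]

    # ids_shuffle[i] is the original index of the patch at shuffled position i.
    ids_shuffle = sorted(range(num_patches), key=lambda i: noise[i])

    # ids_restore[j] is the shuffled position of original patch j
    # (the inverse permutation of ids_shuffle).
    ids_restore = sorted(range(num_patches), key=lambda i: ids_shuffle[i])

    # Only these patches would be passed to the encoder.
    ids_keep = ids_shuffle[:len_keep]

    # In shuffled order the first len_keep patches are visible (0), the rest masked (1).
    mask_shuffled = [0] * len_keep + [1] * (num_patches - len_keep)

    # Map the mask back to the original patch order.
    mask = [mask_shuffled[ids_restore[j]] for j in range(num_patches)]
    return ids_keep, mask, ids_restore


ids_keep, mask, ids_restore = random_masking(num_patches=196)
print(len(ids_keep), sum(mask))  # 49 147
```

In this sketch, `ids_restore` is what lets a decoder put the visible tokens and the mask tokens back into their original positions, while `mask` marks the patches that contribute to the reconstruction loss.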

We also encourage anyone interested to check out the corresponding PR to see what went into making sure the model's components were well tested.

[1] Masked Autoencoders Are Scalable Vision Learners
