Vision Transformers are Robust Learners

Hi folks,

I wanted to share my new work with Pin-Yu Chen (IBM Research): “Vision Transformers are Robust Learners”.

For some time now, Transformers have taken the vision world by storm. In this work, we examine the robustness of Vision Transformers (ViTs). Specifically, we investigate the question:

With the virtue of self-attention, can Vision Transformers provide improved robustness to common corruptions, perturbations, etc.? If so, why?

We build on existing work and investigate the robustness aspects of ViT. Through a series of six systematically designed experiments, we present analyses that provide both quantitative and qualitative evidence for why ViTs are indeed more robust learners.
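For context, common-corruption robustness in this area is typically evaluated ImageNet-C style: apply a corruption at several severity levels and measure how much accuracy degrades. Below is a minimal sketch of one such corruption (Gaussian noise); the severity constants are illustrative placeholders, not the benchmark's exact values.

```python
import numpy as np

def gaussian_noise(image, severity=1):
    """Apply ImageNet-C-style Gaussian noise to an image in [0, 1].

    The per-severity noise standard deviations below are illustrative,
    not the benchmark's published constants.
    """
    stds = [0.04, 0.06, 0.08, 0.10, 0.12]
    std = stds[severity - 1]
    noisy = image + np.random.normal(0.0, std, size=image.shape)
    # Corruptions keep pixel values in the valid range.
    return np.clip(noisy, 0.0, 1.0)

# Corrupt a dummy gray image at all five severities; a robustness
# evaluation would then compare model accuracy on each version.
np.random.seed(0)
clean = np.full((32, 32, 3), 0.5)
corrupted = [gaussian_noise(clean, s) for s in range(1, 6)]
```

A full evaluation would run the model on each corrupted set and report the drop relative to clean accuracy, averaged over corruption types and severities.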


Congratulations - amazing work on ViT research.