Here is my next Keras example, all about implementing Swin Transformers, a general-purpose backbone for computer vision. Swin Transformer is a Transformer-based vision model that uses local (window-based) self-attention, so the cost of self-attention grows linearly with image size instead of quadratically. I go on to demonstrate it for image classification on CIFAR-100.
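To make the linear-complexity point a bit more concrete, here is a minimal NumPy sketch of the window idea. It is not the code from the Keras example: the helper names `window_partition` and `attention_pairs` are just illustrative, and it leaves out the shifted windows and the rest of the actual model. It only shows how a feature map is cut into non-overlapping windows and how many query-key pairs attention has to score with and without them.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C),
    so attention can be computed independently inside each window.
    """
    h, w, c = x.shape
    assert h % window_size == 0 and w % window_size == 0, "H and W must be divisible by window_size"
    # Expose the window grid: (rows of windows, rows in window, cols of windows, cols in window, C)
    x = x.reshape(h // window_size, window_size, w // window_size, window_size, c)
    # Bring the two grid axes together, then flatten each window into a token sequence
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, window_size * window_size, c)


def attention_pairs(h, w, window_size=None):
    """Count the query-key pairs scored for an H x W grid of tokens.

    Global attention scores every token against every other token: (H*W)^2.
    Windowed attention scores each token only against the window_size^2
    tokens in its own window: (H*W) * window_size^2.
    """
    n = h * w
    if window_size is None:
        return n * n
    return n * window_size ** 2


# A 32x32 grid (the CIFAR-100 image size) with 4x4 windows, assumed here for illustration
feature_map = np.zeros((32, 32, 8))
windows = window_partition(feature_map, window_size=4)
print(windows.shape)                          # (num_windows, tokens_per_window, channels)
print(attention_pairs(32, 32))                # global attention
print(attention_pairs(32, 32, window_size=4)) # windowed attention
```

With a fixed window size, doubling the side of the grid multiplies the windowed count by 4 (the same factor as the number of tokens), while the global count grows by 16.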
PS: Following your wonderful suggestion, @lgusm, I am already working on publishing the trained model on TF Hub.
That’s great!!! Thanks!