Semi-supervision and domain adaptation with AdaMatch

In semi-supervised learning (SSL), we train a model using a small amount of labeled data together with a larger unlabeled dataset. This setup is quite common in practice.

In unsupervised domain adaptation (UDA), we have access to a source labeled dataset and a target unlabeled dataset. Then the task is to learn a model that can generalize well to the target dataset. Again, quite practical stuff.

In my latest example, I present an implementation and walkthrough of AdaMatch, which beautifully unifies SSL and UDA. I also introduce a couple of preliminaries to make it easier for folks who are not familiar with the relevant concepts. Expect plots, figures, code, and illustrative code comments.


It’s an excellent implementation.
Could you do an implementation of AdaMatch for SSL?
I was reading your implementation, but I would like to adapt it to an SSL scenario.

Thanks so much for your work here. I notice you stop_gradients on the target pseudo labels. The comments say that this is standard practice, but as I’ve been reading the literature (such as the FixMatch and pi-model papers), I haven’t seen it implemented. In SimSiam yes, but that’s got two different networks, right? Anyhow, I’m trying to figure out how the network can still learn from the unsupervised data if the gradients are stopped. I understand that it’s easy to collapse to a degenerate solution with those gradients turned on, but yeah…how does it learn from that data without getting gradients?

I dug some more into the code and I think I figured it out. You want to stop gradients when making a prediction that creates the label, but you keep the gradients (and thus the learning) when you are making a prediction to compare against that pseudo label.
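To make that concrete, here is a minimal sketch in plain Python rather than the TensorFlow code from the example. Everything in it is an assumption made for illustration: a one-weight binary classifier, a soft pseudo label instead of AdaMatch's thresholded hard labels, and hypothetical names such as `pseudo_label` and `consistency_loss`. The prediction on the weakly augmented input is turned into a plain number, which plays the role of the stopped gradient, while the prediction on the strongly augmented input still depends on the weight and so still produces a gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pseudo_label(w, x_weak):
    # Prediction on the weakly augmented input. The result is used as a
    # constant target from here on, mimicking a stopped gradient.
    return sigmoid(w * x_weak)


def consistency_loss(w, x_strong, target):
    # Binary cross-entropy between the frozen target and the prediction
    # on the strongly augmented input. Only this prediction depends on w.
    p = sigmoid(w * x_strong)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def grad_wrt_w(w, x_strong, target):
    # Analytic gradient of the loss above with the target held constant:
    # dL/dw = (p - target) * x_strong.
    p = sigmoid(w * x_strong)
    return (p - target) * x_strong


w = 0.5                      # hypothetical model weight
x_weak, x_strong = 1.0, 2.0  # two augmented views of the same sample

target = pseudo_label(w, x_weak)
grad = grad_wrt_w(w, x_strong, target)

# Finite-difference check, again keeping the target frozen.
eps = 1e-6
numeric = (
    consistency_loss(w + eps, x_strong, target)
    - consistency_loss(w - eps, x_strong, target)
) / (2 * eps)

print(f"pseudo label: {target:.4f}")
print(f"gradient through the strong branch: {grad:.4f}")
print(f"finite-difference estimate: {numeric:.4f}")
```

The gradient is nonzero whenever the two predictions disagree, so the model still learns from unlabeled data. The stopped gradient only prevents the label-producing branch from being pulled toward the other prediction.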