Semi-supervision and domain adaptation with AdaMatch

In semi-supervised learning (SSL), we use a small amount of labeled data together with a much larger unlabeled dataset to train models. This setting is quite common in practice.

In unsupervised domain adaptation (UDA), we have access to a labeled source dataset and an unlabeled target dataset, and the task is to learn a model that generalizes well to the target dataset. Again, a very practical setting.
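To make the setup concrete, here is a minimal sketch of what one combined training step can look like. The function names, the entropy-style unsupervised term, and the loss weighting are just placeholders for illustration; they are not the AdaMatch loss from the example.

```python
import tensorflow as tf

# Hypothetical sketch: a supervised loss on the labeled (source) batch plus
# an unsupervised loss on the unlabeled (target) batch. The unsupervised term
# here is simple entropy minimization, used only as a stand-in.
def train_step(model, optimizer, labeled_images, labels,
               unlabeled_images, unsupervised_weight=1.0):
    with tf.GradientTape() as tape:
        # Supervised term: standard cross-entropy on the labeled data.
        labeled_logits = model(labeled_images, training=True)
        supervised_loss = tf.reduce_mean(
            tf.keras.losses.sparse_categorical_crossentropy(
                labels, labeled_logits, from_logits=True
            )
        )

        # Unsupervised term: encourage confident predictions on the
        # unlabeled data (a placeholder for the AdaMatch-specific loss).
        unlabeled_logits = model(unlabeled_images, training=True)
        probs = tf.nn.softmax(unlabeled_logits)
        unsupervised_loss = tf.reduce_mean(
            -tf.reduce_sum(probs * tf.math.log(probs + 1e-8), axis=-1)
        )

        total_loss = supervised_loss + unsupervised_weight * unsupervised_loss

    grads = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return total_loss
```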

In my latest example on keras.io, I present an implementation and walkthrough of AdaMatch, which beautifully unifies SSL and UDA. I also introduce a couple of preliminaries to make it easier for folks who are not familiar with the relevant concepts. Expect plots, figures, code, and illustrative code comments.


Congratulations!!
It’s an excellent implementation.
Could you do an implementation of AdaMatch for SSL?
I was reading your implementation on keras.io, but I would like to adapt it to an SSL scenario.

Thanks so much for your work here. I notice you use stop_gradient on the target pseudo-labels. The comments say this is standard practice, but as I’ve been reading the literature (such as the FixMatch and pi-model papers), I haven’t seen it implemented. In SimSiam yes, but that’s got two different networks, right? Anyhow, I’m trying to figure out how the network can still learn from the unsupervised data if the gradients are stopped. I understand that it’s easy to collapse to a degenerate solution with those gradients turned on, but how does it learn from that data without getting gradients?

I dug some more into the code and I think I figured it out. You want to stop gradients on the prediction that creates the pseudo-label, but you keep the gradients (and thus the learning) on the prediction that is compared against that pseudo-label.
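Roughly, a minimal sketch of that pattern looks like the following (placeholder names, not the exact keras.io code):

```python
import tensorflow as tf

# `model` is any classifier; weak/strong refer to differently augmented
# views of the same unlabeled images, as in FixMatch-style pseudo-labeling.
def unlabeled_loss(model, weak_images, strong_images):
    # Prediction used to *create* the pseudo-label: gradients are stopped,
    # so this forward pass only provides a (frozen) target.
    logits_weak = model(weak_images, training=True)
    pseudo_labels = tf.stop_gradient(tf.nn.softmax(logits_weak))

    # Prediction *compared against* the pseudo-label: gradients flow through
    # this pass, so the model still learns from the unlabeled data.
    logits_strong = model(strong_images, training=True)
    loss = tf.keras.losses.categorical_crossentropy(
        pseudo_labels, logits_strong, from_logits=True
    )
    return tf.reduce_mean(loss)
```

So the unlabeled data still drives learning through the second forward pass; only the branch that produces the target is cut off from backpropagation.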
