Of the strategies used in distributed training, namely MultiWorkerMirroredStrategy and ParameterServerStrategy: with the former, the workers have direct access to the dataset and the variables are trained synchronously, whereas ParameterServerStrategy has parameter servers that hold the variables, right?
So does that mean that when implementing ParameterServerStrategy, the dataset can be stored on the machine where the coordinator is instantiated (say, a server with a lot of storage space), which then dispatches training calls to workers on another server (say, an NVIDIA DGX) without the workers having direct access to the dataset?
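To make the question concrete, here is a minimal sketch of the setup I have in mind, based on the documented ParameterServerStrategy / ClusterCoordinator API. All host addresses, ports, and the toy model are placeholders I made up; this would need a real multi-machine cluster (with ps and worker tasks started separately) to actually run.

```python
import tensorflow as tf

# Hypothetical cluster: the coordinator ("chief") runs this script on the
# high-storage server; worker addresses are placeholders for e.g. DGX nodes.
cluster_spec = tf.train.ClusterSpec({
    "chief": ["storage-server:2220"],
    "worker": ["dgx-node-0:2222", "dgx-node-1:2222"],  # hypothetical workers
    "ps": ["storage-server:2223"],  # parameter server holding the variables
})
resolver = tf.distribute.cluster_resolver.SimpleClusterResolver(
    cluster_spec, task_type="chief", task_id=0, rpc_layer="grpc")

strategy = tf.distribute.experimental.ParameterServerStrategy(resolver)

with strategy.scope():
    # Variables created here are placed on the parameter server.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

coordinator = tf.distribute.experimental.coordinator.ClusterCoordinator(
    strategy)

def dataset_fn():
    # As I understand it, this function is dispatched to and executed on
    # each worker, which is exactly why I am unsure whether the dataset
    # files can live only on the coordinator's machine.
    ds = tf.data.Dataset.from_tensor_slices(([[1.0]] * 8, [[2.0]] * 8))
    return ds.repeat().batch(8)

per_worker_dataset = coordinator.create_per_worker_dataset(dataset_fn)
```

Since the `dataset_fn` appears to run on the workers rather than on the coordinator, my question is whether this pattern really lets the workers avoid touching the dataset's storage directly.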
The tutorials, the Inside TensorFlow series, and the Coursera course (Laurence Moroney's Custom and Distributed Training) don't seem to have an end-to-end working demo of ParameterServerStrategy code to test across multiple devices.
Please correct me if I am wrong anywhere…
For reference, there is a related discussion in this Stack Overflow thread: tensorflow2.0 - When is TensorFlow's ParameterServerStrategy preferable to its MultiWorkerMirroredStrategy? - Stack Overflow