I have a cluster of two machines: the first has an RTX 2080 Ti and the second an RTX 3060. I can train a model using tf.distribute.MultiWorkerMirroredStrategy, but, contrary to what I expected, training on a single GPU can be faster. More importantly, the batch does not seem to be split between the two GPUs: with a certain batch size I can train on a single GPU, but I get an error when I use both GPUs. Does anyone know why?
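For context, here is a minimal sketch of how I understand the batching is supposed to work with MultiWorkerMirroredStrategy: the dataset is batched with the *global* batch size, and each replica should then receive global/num_replicas samples per step. The hostnames, ports, and the `per_replica_batch` helper below are hypothetical, just to illustrate the setup I expected:

```python
import json
import os

def per_replica_batch(global_batch_size: int, num_replicas: int) -> int:
    """Hypothetical helper: with MultiWorkerMirroredStrategy the dataset is
    batched with the GLOBAL batch size; each replica is then expected to
    process global_batch_size / num_replicas samples per step."""
    if global_batch_size % num_replicas != 0:
        raise ValueError("global batch size must divide evenly across replicas")
    return global_batch_size // num_replicas

# Hypothetical TF_CONFIG for a two-machine cluster like mine,
# one worker (one GPU) per machine; index is 1 on the second machine.
tf_config = {
    "cluster": {"worker": ["host1:12345", "host2:12345"]},
    "task": {"type": "worker", "index": 0},
}
os.environ["TF_CONFIG"] = json.dumps(tf_config)

# What I expected: a global batch of 64 means each GPU sees 32 samples.
print(per_replica_batch(64, 2))  # → 32
```

Given this, I assumed a batch size that fits on one GPU would also fit when split across two, but that is not what I observe.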