Balance datasets for multi-class semantic segmentation

Hi there,
in natural datasets for semantic segmentation with multiple classes, a frequent problem is the uneven distribution of class frequencies. I was wondering what the best approach is to balance such datasets.
Currently, I am passing per-pixel class weights as the third input to model.fit() (i.e. sample weights), providing a weight map with the same shape as the training mask.
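For context, here is a minimal sketch of how such a per-pixel weight map could be derived from inverse class frequencies. The function name and the two-class toy mask are my own; the normalisation scheme is just one reasonable choice:

```python
import numpy as np

def pixel_weight_map(mask, num_classes):
    """Build a per-pixel weight map from inverse class frequencies.

    mask: integer array of shape (H, W) containing class ids.
    Returns an array of shape (H, W) with weights of order 1.
    """
    counts = np.bincount(mask.ravel(), minlength=num_classes).astype(float)
    freqs = counts / counts.sum()
    # Inverse frequency; classes absent from this mask get weight 0
    # so we never divide by zero.
    weights = np.zeros(num_classes)
    present = counts > 0
    weights[present] = 1.0 / freqs[present]
    weights /= weights[present].mean()  # keep the weights around 1.0
    return weights[mask]

# Toy mask: class 0 dominates, class 1 is rare.
mask = np.zeros((4, 4), dtype=int)
mask[0, 0] = 1
wmap = pixel_weight_map(mask, num_classes=2)
# The rare-class pixel receives a larger weight than background pixels.
```

In tf.keras such a map can then travel as the third element of an `(image, mask, weight_map)` tuple fed to model.fit(), matching the setup you describe; whether the frequencies should be computed per image or over the whole dataset is a design choice.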
However, I wondered whether it would also be an option to duplicate images in the dataset that contain minority classes. I would like to know what you think of such an approach in general, but I would also be interested in ways to implement it efficiently (assuming you think it makes sense at all). Would you build a matrix containing the per-class frequencies of each image and then calculate how many times each individual image has to be replicated? And would you have to replicate the images and save all the copies to disk, or is there some other way?
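One way the replication calculation you describe could look, sketched under my own assumptions (scoring each image by the rarest class it contains, and replicating indices rather than files, so nothing extra is written to disk):

```python
import numpy as np

def replication_factors(class_presence):
    """How many times each image should appear per epoch.

    class_presence: boolean matrix (n_images, n_classes); entry [i, c]
    is True if image i contains at least one pixel of class c.
    Returns an integer vector of replication counts (>= 1).
    """
    # How many images contain each class.
    images_per_class = class_presence.sum(axis=0).astype(float)
    # Score each image by its rarest class present (inf if no class labeled).
    rarest = np.where(class_presence, images_per_class, np.inf).min(axis=1)
    # Replicate rare-class images until classes are roughly balanced;
    # images with no labeled class are kept once.
    return np.maximum(np.ceil(images_per_class.max() / rarest), 1).astype(int)

def oversampled_indices(factors, rng=np.random.default_rng()):
    """Expand dataset indices in memory; no replicas written to disk."""
    idx = np.repeat(np.arange(len(factors)), factors)
    rng.shuffle(idx)
    return idx

# Toy example: 3 images, 2 classes; only image 1 contains the rare class 1.
presence = np.array([[True, False],
                     [True, True],
                     [True, False]])
factors = replication_factors(presence)  # image 1 is replicated 3x
idx = oversampled_indices(factors)       # use idx to index your image list
```

Instead of fixed replication counts, you could also sample indices with probabilities proportional to rarity (e.g. np.random.choice with a p vector), or let the input pipeline interleave per-class streams (tf.data.Dataset.sample_from_datasets); both avoid duplicating any data on disk as well.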

The approach is demonstrated for training on the sparse ground truth of a heterogeneously labeled dataset, within a transfer-learning setting.