How to restore a picture after deleting an object

I had the idea of building a neural network that removes text from images (manga, comics).

Unfortunately, I couldn’t find any examples online of generating only part of an image, where the rest of the background and the mask of the removed object serve as additional input to the neural network.

Are there any examples (preferably with a detailed explanation) of what architecture can be used?

I looked into GANs, but as far as I can tell they only generate a full picture.

If your use case is to remove text from images, then this video could help you.
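On the architecture question: this task is usually called image inpainting, and a GAN generator can still be used even though it outputs a full picture. A common setup feeds the generator the image with the removed region blanked out plus the mask as an extra channel, then keeps the generated pixels only inside the mask. Below is a minimal NumPy sketch of just that data flow, not a full model. The function names `make_generator_input` and `composite` are hypothetical, and the random array stands in for the output of whatever network you train.

```python
import numpy as np

def make_generator_input(image, mask):
    """Build the network input from the background and the mask.

    image: H x W x C float array in [0, 1]
    mask:  H x W array, 1 where the object (text) was removed, 0 elsewhere
    """
    mask3 = mask[..., None].astype(image.dtype)
    masked_image = image * (1.0 - mask3)  # blank out the region to regenerate
    # Concatenate the mask as an extra channel: H x W x (C + 1)
    return np.concatenate([masked_image, mask3], axis=-1)

def composite(image, mask, generated):
    """Keep original pixels outside the mask, generated pixels inside it."""
    mask3 = mask[..., None].astype(image.dtype)
    return mask3 * generated + (1.0 - mask3) * image

# Toy data: an 8x8 RGB image with a 3x3 removed region.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
mask = np.zeros((8, 8))
mask[2:5, 3:6] = 1

net_input = make_generator_input(image, mask)

# Assumption: stand-in for the generator's full-picture output.
generated = rng.random((8, 8, 3))

result = composite(image, mask, generated)
```

With this compositing step the network may produce a full picture, but only the masked region of the final result comes from it; the rest of the page is copied from the original unchanged.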

Please let me know if this solved your issue