How to preserve HTML structure when processing

Hi, I’m trying to figure out how to run scraped content through language models while retaining the HTML tags. I get pretty good results with pegasus-paraphraser, but I’d like to bulk process whole pages. Whenever I include tags in the input, they are removed from the model’s output. Does anyone have a solution?

Do you want to control the text generation directly? That is not an easy task…
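If controlling generation turns out to be too hard, one possible workaround is to never show the tags to the model at all: parse the page, send only the text nodes through the paraphraser, and write the tags back out unchanged. Below is a minimal sketch of that idea using only Python's standard `html.parser`. The names `TextRewriter` and `rewrite_html` are made up for illustration, and `rewrite` stands in for whatever function wraps your paraphrasing model (not shown here).

```python
from html.parser import HTMLParser
from typing import Callable, List


class TextRewriter(HTMLParser):
    """Re-emit HTML as-is, passing only text nodes through `rewrite`."""

    def __init__(self, rewrite: Callable[[str], str]) -> None:
        # Keep entities such as &amp; as separate tokens so they can be
        # written back exactly as they appeared.
        super().__init__(convert_charrefs=False)
        self.rewrite = rewrite  # hypothetical hook: your paraphraser goes here
        self.out: List[str] = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        core = data.strip()
        if self.skip_depth or not core:
            self.out.append(data)  # code or pure whitespace: leave untouched
            return
        # Preserve the surrounding whitespace, rewrite only the text itself.
        lead = data[: len(data) - len(data.lstrip())]
        trail = data[len(data.rstrip()):]
        self.out.append(lead + self.rewrite(core) + trail)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_html(html: str, rewrite: Callable[[str], str]) -> str:
    """Return `html` with every text node replaced by `rewrite(text)`."""
    parser = TextRewriter(rewrite)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Stand-in for a real paraphraser, just to show that the tags survive.
print(rewrite_html('<div class="a"><p>Hello world</p><br/></div>', str.upper))
```

Be aware that inline tags such as `<b>` or `<a>`, and entities such as `&amp;`, split a sentence into several text nodes, so the model sees fragments rather than full sentences. For bulk processing you would probably collect all the text nodes of a page first and paraphrase them in batches.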

