About multimodal fusion in deep learning

Hi,
I am new to this forum. Greetings to you all.
I have many questions about fusing multimodal datasets.
Is anyone here working on this approach?