TF-Agents episodic replay buffers

Hello everyone, this is my first post!
I am trying out the DQN algorithm on the FrozenLake environment as a DRL exercise, and I am running into issues connecting the replay buffer to the DQN agent.
Because the rewards are sparse (1 only when the agent lands on the goal state, 0 otherwise), I want to compute the Monte-Carlo return for each step, which means I have to learn on whole episodes.
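For concreteness, this is the kind of per-step discounted return I want to compute once an episode has terminated (a minimal NumPy sketch; the function name and gamma = 0.99 are just my choices):

```python
import numpy as np

def monte_carlo_returns(rewards, gamma=0.99):
    """Discounted Monte-Carlo return G_t for every step of a
    finished episode, accumulated backwards from the last reward."""
    returns = np.zeros(len(rewards), dtype=np.float32)
    g = 0.0
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns

# FrozenLake: reward 1 only on the final (goal) step.
print(monte_carlo_returns([0.0, 0.0, 0.0, 1.0]))
# [0.970299 0.9801   0.99     1.      ]
```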
However, the TFUniformReplayBuffer does not support this: its as_dataset method samples fixed-length sequences of steps rather than whole episodes. I tried the ReverbReplayBuffer as well, but the DQN agent's train method does not accept variable-length episodes.
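For reference, here is roughly how the step-based sampling looks in my setup (a sketch, not my full script; `agent` and `replay_buffer` stand for the DqnAgent and TFUniformReplayBuffer I built earlier):

```python
# Assumes `agent` is a DqnAgent and `replay_buffer` a
# TFUniformReplayBuffer created beforehand.
dataset = replay_buffer.as_dataset(
    sample_batch_size=64,
    num_steps=2,  # fixed-length windows of steps, never whole episodes
    num_parallel_calls=3).prefetch(3)
iterator = iter(dataset)

experience, _ = next(iterator)
# DqnAgent.train expects exactly n_step_update + 1 time steps per
# sample, so a variable-length episode is rejected here.
loss_info = agent.train(experience)
```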
Is there a way to train on episodic data, or is DQN a bad fit here, given that I need the MC return in any case?