Is the SAC agent in TensorFlow up to date?

Janis_Taranda · November 16, 2021, 2:40am

I found this:

…there is a new version of the algorithm that uses only a Q function and disposes of the V function. It also adds automatic discovery of the weight of the entropy term called the ‘temperature’

Looking at the current code, I do not see this temperature parameter. Is it just named differently?

Reference:

https://arxiv.org/abs/1812.05905

Sergio_Guadarrama · November 16, 2021, 10:54pm

Hi Janis, those are two different algorithms, CQL-SAC and SAC.

CqlSacAgent implements the CQL algorithm for continuous control domains from “Conservative Q-Learning for Offline Reinforcement Learning” (Kumar et al., 2020).

The SacAgent uses both the target_entropy and the automatic entropy adjustment described in the paper.
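
To make the automatic entropy adjustment concrete, here is a minimal pure-Python sketch of the temperature update from arXiv:1812.05905. It is not the TF-Agents implementation. The function names alpha_loss and update_log_alpha, the learning rate, and the sample log-probabilities are all hypothetical and chosen for illustration. The sketch assumes the temperature is parameterized as alpha = exp(log_alpha) and trained by plain gradient descent on the mean of -alpha * (log_pi + target_entropy).

```python
import math


def alpha_loss(log_alpha, log_pis, target_entropy):
    """Temperature objective: mean over samples of -alpha * (log_pi + target_entropy)."""
    alpha = math.exp(log_alpha)
    return sum(-alpha * (lp + target_entropy) for lp in log_pis) / len(log_pis)


def update_log_alpha(log_alpha, log_pis, target_entropy, lr=0.1):
    """One gradient-descent step on alpha_loss with respect to log_alpha."""
    alpha = math.exp(log_alpha)
    mean_log_pi = sum(log_pis) / len(log_pis)
    # d/d(log_alpha) of -exp(log_alpha) * (mean_log_pi + target_entropy)
    grad = -alpha * (mean_log_pi + target_entropy)
    return log_alpha - lr * grad


# Hypothetical numbers: a 2-D action space with the common heuristic
# target_entropy = -dim(A) = -2.0.
target_entropy = -2.0

# Entropy estimate -mean(log_pi) = -3.0 is below the target, so alpha should grow.
low_entropy_log_pis = [3.0, 3.0]
raised = update_log_alpha(0.0, low_entropy_log_pis, target_entropy)

# Entropy estimate 0.0 is above the target, so alpha should shrink.
high_entropy_log_pis = [0.0, 0.0]
lowered = update_log_alpha(0.0, high_entropy_log_pis, target_entropy)

print(math.exp(raised) > 1.0, math.exp(lowered) < 1.0)  # prints: True True
```

In TF-Agents the matching knobs appear, as far as I can tell, as the target_entropy, initial_log_alpha, and alpha_optimizer arguments of the SacAgent constructor, which would explain why no argument is literally called "temperature". Please check the current API docs before relying on those names.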