Changing action_spec

JM_the_great · July 27, 2021, 11:42am

I’d like to program an environment whose action_spec changes, for a game that alternates between two phases. The decisions to be made in the two phases are quite different. How would one go about that?

JM_the_great · July 28, 2021, 7:10am

Some additional info and my own thoughts on that:

the observation contains 0 or 1 to signify the phase and, in phase 1, the dice result
if the phase is 0, action 0 is “pass and take your share” and action 1 is “go on”
if the phase is 1, actions 0-13 are each interpreted as a corresponding action in that game

I could easily treat actions 2-13 as “no-action” in phase 0, or treat all of 1-13 as “go on”. But I expect that would make convergence much slower and less likely, since the DQN would have to learn these additional, unnecessary relations.
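
To make the two workarounds concrete, here is a minimal sketch in plain Python (not actual TF-Agents code). It assumes a fixed action range of 0-13 in both phases; the function name `interpret_action`, the `collapse` flag, and the string labels are made up for illustration only.

```python
N_ACTIONS = 14  # fixed action range 0..13, identical in both phases


def interpret_action(phase, action, collapse=False):
    """Map a raw action from the fixed 0..13 range to a phase-specific move.

    Returns a (kind, value) tuple. The kind labels are hypothetical.
    collapse=False: in phase 0, actions 2..13 are treated as "no-action".
    collapse=True:  in phase 0, all of 1..13 are treated as "go on".
    """
    if not 0 <= action < N_ACTIONS:
        raise ValueError(f"action {action} is outside 0..{N_ACTIONS - 1}")

    if phase == 0:
        if action == 0:
            return ("pass", None)   # pass and take your share
        if action == 1 or collapse:
            return ("go_on", None)
        return ("no_op", None)      # surplus action, ignored in phase 0

    if phase == 1:
        # All 14 actions are meaningful here; hand the index to the game logic.
        return ("game_action", action)

    raise ValueError(f"unknown phase {phase}")
```

With this, the environment's step logic would call something like `interpret_action(phase, action)` before applying the move. Either way, the agent still has to learn from experience that most actions are redundant in phase 0, which is the slowdown I am worried about.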

JM_the_great · July 28, 2021, 6:17pm

I realize now that my question is not necessary for this game. I can just roll the dice and present them as the observation, then ask for one of the actions 0-13 to apply to them, each with deterministic consequences, and, as a 15th action, whether to “pass” or “go on” afterwards.

Nevertheless, there are definitely games out there with different (repeating) phases separated by stochastic elements, such as dice rolls or the moves of other players, so the actions for the different phases cannot simply be combined into one vector like that. So the question remains to be answered, even though I can continue my project now.