Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games

Hongyi Guo 1, Zuyue Fu 1, Zhuoran Yang 2, Zhaoran Wang 1

Abstract

as Markov decision process (Puterman, 2014, MDP), where an agent aims to learn an opti...