Discretization Drift in Two-player Games. Mihaela Rosca (1, 2), Yan Wu (1), Benoit Dherin (3), David G. T. Barrett (1). Abstract: [...] of two player games by finding continuous systems which better match the gradient descent updates used in practice. Gradi...
Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-player Stochastic Games. Hongyi Guo (1), Zuyue Fu (1), Zhuoran Yang (2), Zhaoran Wang (1). Abstract: [...] as Markov decision process (Puterman, 2014, MDP), where an agent aims to learn an opti...
Adversarial Policy Learning in Two-player Competitive Games. Wenbo Guo (1), Xian Wu (1), Sui Huang (2), Xinyu Xing (1). Abstract: [...] 2020), we argue that attacks developed under this assumption are not practical. For example, given a master agent Inat...