Momentum-Based Policy Gradient Methods

Feihu Huang 1  Shangqian Gao 1  Jian Pei 2  Heng Huang 1 3

Abstract

In the paper, we prop...

timesteps, and then maximizes the long-term cumulative rewards to obtain an optimal policy. Due to easy imple-