Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients Lukas Balles 1 Philipp Hennig 1 Abstract which is a random variable with E[g(θ)] = ∇L(θ). An important quantity for this paper will be the (element-wise) Th...
A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery Lingxiao Wang 1 Xiao Zhang 1 Quanquan Gu 1 Abstract laxation based optimization (Srebro et al., 2004; Candès & Tao, 2010; Rohde et al., 2011; Recht...
Stochastic Variance Reduction Methods for Policy Evaluation Simon S. Du 1 Jianshu Chen 2 Lihong Li 2 Lin Xiao 2 Dengyong Zhou 2 Abstract important information for the agent to optimize its policy. For example, policy-iteration algori...
Evaluating the Variance of Likelihood-Ratio Gradient Estimators Seiya Tokui 1,2 Issei Sato 3,2 Abstract fore Variance reduction is crucial for practical learning. However, few things are known about its theoretical as- The likeliho...
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning Oron Anschel 1 Nir Baram 1 Nahum Shimkin 1 Abstract for problem-specific state representation. These problem-specific features diminish the ag...