Multi-Task Reinforcement Learning with Context-based Representations. Shagun Sodhani¹, Amy Zhang¹²³, Joelle Pineau¹²³. Abstract. Reinforcement learning (MTRL) is a promising approach to train effective real-world agents (Tanaka...
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment. Michael Chang¹, Sidhant Kaushik¹, Sergey Levine¹, Thomas L. Griffiths². Abstract. 1. Introduction. Many transfer problems require re-using prev...
Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity. Zihan Zhang¹, Yuan Zhou², Xiangyang Ji¹. Abstract. In RL theory, model-free algorithms are explicitly defined to be the ones whose space complexity i...
Model-Based Reinforcement Learning via Latent-Space Collocation. Oleh Rybkin¹, Chuning Zhu¹, Anusha Nagabandi², Kostas Daniilidis¹, Igor Mordatch³, Sergey Levine⁴. Abstract. [Figure label: LatCo optimization over latent states] The ability to plan i...
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling. Ying Fan¹, Yifei Ming¹. Abstract. in RL has been one of the main challenges: the agent is expected to balance between exploring unseen state-action Bal...
MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration. Jin Zhang¹, Jianhao Wang¹, Hao Hu¹, Tong Chen¹, Yingfeng Chen², Changjie Fan², Chongjie Zhang¹. Abstract. with sparse rewards remains challenging, as task-rele...
Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision. Johan Bjorck¹, Xiangyu Chen¹, Christopher De Sa¹, Carla P. Gomes¹, Kilian Q. Weinberger¹. Abstract. learning, an emerging trend for accelerating deep l...
Logarithmic Regret for Reinforcement Learning with Linear Function Approximation. Jiafan He¹, Dongruo Zhou¹, Quanquan Gu¹. Abstract. A common approach to cope with high-dimensional state and action spaces is to utilize function appro...
Learning Routines for Effective Off-Policy Reinforcement Learning. Edoardo Cetin¹, Oya Celiktutan¹. Abstract. engineering and are often quite influential on the performance (Mahmood et al., 2018). Algorithms that learn also The pe...
Inverse Constrained Reinforcement Learning. Shehryar Malik¹, Usman Anwar¹, Alireza Aghasi², Ali Ahmed¹. Abstract. [Figure panels: (a) Expert policy, (b) Nominal policy, (c) Recovered policy] In real world settings, numerous constraints are Figure 1. A si...
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning. Yifang Chen¹, Simon S. Du¹, Kevin Jamieson¹. Abstract. stage according to the underlying transition function. We study episodic reinforcement learning unde...
High Confidence Generalization for Reinforcement Learning. James E. Kostas¹, Yash Chandak¹, Scott M. Jordan¹, Georgios Theocharous², Philip S. Thomas¹. Abstract. [per]formance on MDPs drawn from the distribution, including MDPs not in the tr...
Goal-Conditioned Reinforcement Learning with Imagined Subgoals. Elliot Chane-Sane¹, Cordelia Schmid¹, Ivan Laptev¹. Abstract. Figure 1. Illustration of the KL-regularized policy learning using imagined subgoals. (Left): The poli...
Generalizable Episodic Memory for Deep Reinforcement Learning. Hao Hu¹, Jianing Ye², Guangxiang Zhu¹, Zhizhou Ren³, Chongjie Zhang¹. Abstract. [Figure labels: Discrete Episodic Memory; Key, Value] Episodic memory-based methods can rapidly latch onto pas...
Kernel-Based Reinforcement Learning: A Finite-Time Analysis. Omar D. Domingues¹², Pierre Ménard³, Matteo Pirotta⁴, Emilie Kaufmann¹⁵, Michal Valko¹⁵⁶. Abstract. in the face of uncertainty (OFU, Jaksch et al. 2010) and Thompson Sampl...
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL. Andrea Zanette¹. Abstract. we consider two classical batch RL problems: 1) the off-policy evaluation (OPE) problem, wher...
Emphatic Algorithms for Deep Reinforcement Learning. Ray Jiang¹, Tom Zahavy¹, Zhongwen Xu¹, Adam White¹², Matteo Hessel¹, Charles Blundell¹, Hado van Hasselt¹. Abstract. Many reinforcement learning (RL) agents learn off-policy to some ex...
Emergent Social Learning via Multi-agent Reinforcement Learning. Kamal Ndousse¹, Douglas Eck², Sergey Levine²³, Natasha Jaques²³. Abstract. Humans are able to learn from one another without direct access to the experiences or memories...
Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations. Angeliki Kamoutsi¹, Goran Banjac¹, John Lygeros¹. Abstract. In the standard RL setting a cost signal is given to instruct agents how to complete...
Discovering symbolic policies with deep reinforcement learning. Mikel Landajuela¹, Brenden K. Petersen¹, Sookyung Kim¹, Claudio P. Santiago¹, Ruben Glatt¹, T. Nathan Mundhenk¹, Jacob F. Pettit¹, Daniel M. Faissol¹. Abstract. Figure 1: Alg...