Provably Efficient Learning of Transferable Rewards Alberto Maria Metelli¹ Giorgia Ramponi¹ Alessandro Concetti¹ Marcello Restelli¹ Abstract theoretically, under the strong assumption of reward uniqueness (Abbeel & Ng, 2004...
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning Kevin Li¹ Abhishek Gupta¹ Vitchyr Pong¹ Ashwin Reddy¹ Aurick Zhou¹ Justin Yu¹ Sergey Levine¹ Abstract Figure 1. MURAL: Our method trains...
Dynamic Planning and Learning under Recovering Rewards David Simchi-Levi¹ Zeyu Zheng² Feng Zhu¹ Abstract immediately drops after it is pulled, and then gradually recovers if the arm is not pulled in the subsequent time periods. Mot...
Detecting Rewards Deterioration in Episodic Reinforcement Learning Ido Greenberg¹ Shie Mannor¹,² Abstract RL tasks is the safety and reliability of the system (Dulac-Arnold et al., 2019; Chan et al., 2020), arising in both of- In man...
Option Discovery in the Absence of Rewards with Manifold Analysis Amitay Bar¹ Ronen Talmon¹ Ron Meir¹ Abstract the graph edges represent the states connectivity. Such an approach led to the introduction of proto-value functions Opt...
Optimizing Data Usage via Differentiable Rewards Xinyi Wang¹ Hieu Pham¹,² Paul Michel¹ Antonios Anastasopoulos¹ Jaime Carbonell¹ Graham Neubig¹ Abstract Previous work has attempted to create strategies to handle this sensitivit...
Collaborative Machine Learning with Incentive-Aware Model Rewards Rachael Hwee Ling Sim¹ Yehong Zhang¹ Mun Choon Chan¹ Bryan Kian Hsiang Low¹ Abstract from other hospitals and firms to improve the prediction of some disease progre...
Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning Thomas Dietterich¹ George Trimponias² Zhitang Chen² Abstract channel. This high degree of stochasticity can confuse reinforcement lea...