Kernel-Based Reinforcement Learning in Robust Markov Decision Processes. Shiau Hong Lim (1), Arnaud Autef (2). Abstract. This class includes kernel averaging, k-nearest-neighbor, weighted k-nearest neighbor, Bezier patches, linear in...
Information-Theoretic Considerations in Batch Reinforcement Learning. Jinglin Chen (1), Nan Jiang (1). Abstract. when they work is central to our understanding of RL. Existing works that analyze error propagation and finite sam- Value-f...
Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI. Lei Han (1), Peng Sun (1), Yali Du (2, 3), Jiechao Xiong (1), Qing Wang (1), Xinghai Sun (1), Han Liu (4), Tong Zhang (5). Abstract. et al., 2016), etc. Among these, RL in game AI research attract...
Generative Adversarial User Model for Reinforcement Learning Based Recommendation System. Xinshi Chen (1, †), Shuang Li (2), Hui Li (4), Shaohua Jiang (4), Yuan Qi (4), Le Song (3, 4). Abstract. explicitly take into account the long-term user interest. How-...
Fingerprint Policy Optimisation for Robust Reinforcement Learning. Supratik Paul (1), Michael A. Osborne (2), Shimon Whiteson (1). Abstract. across all possible settings. Fortunately, policies can often be trained and tested in a simulator t...
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations. Daniel S. Brown (1), Wonjoon Goo (1), Prabhat Nagarajan (2), Scott Niekum (1). Abstract. Figure 1. T-REX takes a sequence of ranked demonstrat...
Exploration Conscious Reinforcement Learning Revisited. Lior Shani (1), Yonathan Efroni (1), Shie Mannor (1). Abstract. RL, i.e., when using function approximation, remains an open problem. On the practical side, recent works com- The Explora...
Dynamic Weights in Multi-Objective Deep Reinforcement Learning. Axel Abels (1, 2), Diederik M. Roijers (3), Tom Lenaerts (1, 2), Ann Nowé (2), Denis Steckelmacher (2). Abstract. as a linear scalarization with weights per objective that are known in adv...
Distributional Reinforcement Learning for Efficient Exploration. Borislav Mavrin (1, 2), Hengshuai Yao (3), Linglong Kong (1, 2), Kaiwen Wu (4), Yaoliang Yu (4). Abstract. Deterministic environment. In distributional reinforcement learning (RL), th...
Dead-ends and Secure Exploration in Reinforcement Learning. Mehdi Fatemi (1), Shikhar Sharma (1), Harm van Seijen (1), Samira Ebrahimi Kahou (2). Abstract. has to interact with the environment and learn from its experience. There always exists a (...
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning. Cédric Colas (1), Pierre Fournier (2), Olivier Sigaud (2), Mohamed Chetouani (2), Pierre-Yves Oudeyer (1). Abstract. Figure 1. The Modular Goal Fetch Arm environment....
Control Regularization for Reduced Variance Reinforcement Learning. Richard Cheng (1), Abhinav Verma (2), Gábor Orosz (3), Swarat Chaudhuri (2), Yisong Yue (1), Joel W. Burdick (1). Abstract. rithms focus on maximizing the long-term reward through tr...
Composing Value Functions in Reinforcement Learning. Benjamin van Niekerk (1), Steven James (1), Adam Earle (1), Benjamin Rosman (1, 2). Abstract. previous abilities. An important property for lifelong-learning In reinforcement learning (RL), o...
Collaborative Evolutionary Reinforcement Learning. Shauharda Khadka (1, 2), Somdeb Majumdar (1), Tarek Nassar (1), Zach Dwiel (1), Evren Tumer (1), Santiago Miret (1), Yinyin Liu (1), Kagan Tumer (2). Abstract. to robotic control (Andrychowicz et al., 2017; Lil...
Calibrated Model-Based Deep Reinforcement Learning. Ali Malik (1), Volodymyr Kuleshov (1, 2), Jiaming Song (1), Danny Nemer (2), Harlan Seymour (2), Stefano Ermon (1). Abstract. Figure 1. Modern model-based planning algorithms with probabilistic mode...
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning. Jakob N. Foerster (1, 2), H. Francis Song (3), Edward Hughes (3), Neil Burch (3), Iain Dunning (3), Shimon Whiteson (1), Matthew M. Botvinick (3), Michael Bowling (3). Abstract. 1. Introduction...
Actor-Attention-Critic for Multi-Agent Reinforcement Learning. Shariq Iqbal (1), Fei Sha (1, 2). Abstract. ment learning have been developed. The simplest approach is to train each agent independently to maximize their in- Reinforcement l...
Action Robust Reinforcement Learning and Applications in Continuous Control. Chen Tessler (1), Yonathan Efroni (1), Shie Mannor (1). Abstract. The advantage of robust policies is highlighted when considering imperfect models, a common scen...
A Deep Reinforcement Learning Perspective on Internet Congestion Control. Nathan Jay (1), Noga H. Rotman (2), P. Brighten Godfrey (1), Michael Schapira (2), Aviv Tamar (3). Abstract. Figure 1: Multiple traffic flows sharing a link. We present and invest...
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. André Barreto (1), Diana Borsa (1), John Quan (1), Tom Schaul (1), David Silver (1), Matteo Hessel (1), Daniel Mankowitz (1), Augustin Žídek (1), Ré...