Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning. Gen Li¹, Changxiao Cai², Yuxin Chen², Yuantao Gu¹, Yuting Wei³, Yuejie Chi⁴. Abstract: Q-learning (Borkar & Meyn, 2000; Jaakkola et al., 1994; Szepesvári, 1998; ...
Tightening Exploration in Upper Confidence Reinforcement Learning. Hippolyte Bourel¹, Odalric-Ambrym Maillard¹, Mohammad Sadegh Talebi². Abstract: The upper confidence reinforcement learning ... 1. Introduction: In this paper, we con...