Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment. Michael Chang (1), Sidhant Kaushik (1), Sergey Levine (1), Thomas L. Griffiths (2). Abstract: 1. Introduction Many transfer problems require re-using prev...
Modelling Behavioural Diversity for Learning in Open-Ended Games. Nicolas Perez Nieves (1, 2), Yaodong Yang (1, 3), Oliver Slumbers (3), David Henry Mguni (1), Ying Wen (3), Jun Wang (1, 3). Abstract: developing AIs that achieve super-human performance on sop...
On Variational Inference in Biclustering Models. Guanhua Fang, Ping Li. Cognitive Computing Lab, Baidu Research, 10900 NE 8th St, Bellevue, WA 98004, USA. {guanhuafang,liping11}@baidu.com. Abstract: studies (Prelić et al., 2006; Gu and L...
On the Problem of Underranking in Group-Fair Ranking. Sruthi Gorantla (1), Amit Deshpande (2), Anand Louis (1). Abstract: ethical concerns and can potentially cause long-term economic and societal harm to demographics and businesses Bias in r...
Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation. Qian Zhang (1), Yilin Zheng (2), Jean Honorio (2). Abstract: 1. Introduction in this paper, we study meta learning for sup- Precision (or inverse covariance) m...
Measuring Robustness in Deep Learning Based Compressive Sensing. Mohammad Zalbagi Darestani (1), Akshay S. Chaudhari (2), Reinhard Heckel (1, 3). Abstract: sider the compressive sensing problem arising in magnetic resonance imaging (MRI), wh...
Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision. Johan Bjorck (1), Xiangyu Chen (1), Christopher De Sa (1), Carla P. Gomes (1), Kilian Q. Weinberger (1). Abstract: learning, an emerging trend for accelerating deep l...
Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries. Arjun Nitin Bhagoji (1), Daniel Cullina (2), Vikash Sehwag (3), Prateek Mittal (3). Abstract: on establishing fundamental bounds on learning in the presence of test-tim...
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards. Susan Amin (1, 2), Maziar Gomrokchi (1, 2), Hossein Aboutalebi (3, 4), Harsh Sajita (1, 2), Doina Precup (1, 2). Abstract: call for a clever exploration strategy that exposes the...
Leveraging Non-uniformity in First-order Non-convex Optimization. Jincheng Mei (1, 2), Yue Gao (1), Bo Dai (2), Csaba Szepesvári (3, 1), Dale Schuurmans (2, 1). Abstract: in reinforcement learning (RL) (Agarwal et al., 2020), super- Classical global c...
Leveraging Good Representations in Linear Contextual Bandits. Matteo Papini (†, 1), Andrea Tirinzoni (1), Marcello Restelli (1), Alessandro Lazaric (2), Matteo Pirotta (2). Abstract: range of domains, including recommendation systems, on- The lin...
Lenient Regret and Good-Action Identification in Gaussian Process Bandits. Xu Cai (1), Selwyn Gomes (1), Jonathan Scarlett (1, 2). Abstract: gorithms can often be applied in a unified manner in these two settings. In this paper, we study the proble...
Learning While Playing in Mean-Field Games: Convergence and Optimality. Qiaomin Xie (1), Zhuoran Yang (2), Zhaoran Wang (3), Andreea Minca (1). Abstract: from the scalability issue. Specifically, in a multi-agent system, each agent interacts wit...
Learning to Rehearse in Long Sequence Memorization. Zhu Zhang (1, 2), Chang Zhou (2), Jianxin Ma (2), Zhijie Lin (1), Jingren Zhou (2), Hongxia Yang (2), Zhou Zhao (1). Abstract: 2020b), or predict whether a user will click the given item based on the user behaviors...
Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation. Chao Chen (1), Haoyu Geng (1, 2), Nianzu Yang (1, 2), Junchi Yan (1, 2), Daiyue Xue (3), Jianping Yu (3), Xiaokang Yang (1, 2). Abstract: fade away due to mat...
Learning in Nonzero-Sum Stochastic Games with Potentials. David Mguni (1), Yutong Wu (2), Yali Du (3), Yaodong Yang (1, 3), Ziyi Wang (2), Minne Li (3), Ying Wen (4), Joel Jennings (1), Jun Wang (3). Abstract: autonomous vehicles seeking to arrive at their individual des...
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning. Matthieu Zimmer (1), Claire Glanois (1), Umer Siddique (1), Paul Weng (1, 2). Abstract: current main focus is on their performance with respect to the total (o...
Learning and Planning in Complex Action Spaces. Thomas Hubert (1), Julian Schrittwieser (1), Ioannis Antonoglou (1), Mohammadamin Barekatain (1), Simon Schmitt (1), David Silver (1). Abstract: real-world problems. Many important real-world problems...
Learning and Planning in Average-Reward Markov Decision Processes. Yi Wan (1), Abhishek Naik (1), Richard S. Sutton (1, 2). Abstract: with it. For learning and combined methods, both control and prediction problems can be further subdivided into...