Top-k eXtreme Contextual Bandits with Arm Hierarchy. Rajat Sen¹, Alexander Rakhlin²,³, Lexing Ying⁴,³, Rahul Kidambi³, Dean Foster³, Daniel Hill³, Inderjit S. Dhillon⁵,³. Abstract: [...] 1. Introduction. Motivated by modern applications, such as o...
Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions. Tal Lancewicki¹, Shahar Segal¹, Tomer Koren¹,², Yishay Mansour¹,². Abstract: [...]tion, like in the classic stochastic MAB problem. However, [...] We study the stochastic Mult...
Robust Pure Exploration in Linear Bandits with Limited Budget. Ayya Alieva¹, Ashok Cutkosky², Abhimanyu Das³. Abstract: [...] the exploration phase should be somehow efficient - we wish to make the best use of our limited budget in order to [...] We con...
Quantile Bandits for Best Arms Identification. Mengyan Zhang¹,², Cheng Soon Ong²,¹. [Table residue: column headers Mean, 0.5-Quantile, Mean, 0.8-Quantile; rows A: 3.50, 3.50; C: 1.45, 2.33.] Abstract: We consider a variant of the best arm identification task in stochastic multi-arme...
Optimal Streaming Algorithms for Multi-Armed Bandits. Tianyuan Jin¹, Keke Huang¹, Jing Tang², Xiaokui Xiao¹. Abstract: [...]son, 1933), online advertisement (Bertsimas & Mersereau, 2007), and crowdsourcing (Zhou et al., 2014). It typicall...
Online Limited Memory Neural-Linear Bandits with Likelihood Matching. Ofir Nabati¹, Tom Zahavy¹,², Shie Mannor¹,³. Abstract: [...]ploration during the representation learning phase is still an open problem. The ε-greedy policy (Langford & Zh...
On Limited-Memory Subsampling Strategies for Bandits. Dorian Baudry¹, Yoan Russac², Olivier Cappé². Abstract: Multi-armed Bandits models have been used to address a wide range of sequential optimization tasks under uncer- [...] There has b...
Offline Contextual Bandits with Overparameterized Models. David Brandfonbrener¹, William F. Whitney¹, Rajesh Ranganath¹, Joan Bruna¹. Abstract: In contrast, the best performance in modern supervised learning is often achieved by mas...
Near-Optimal Representation Learning for Linear Bandits and Linear RL. Jiachen Hu¹, Xiaoyu Chen¹, Chi Jin², Lihong Li³, Liwei Wang¹,⁴. Abstract: While representation learning has achieved tremendous success in a variety of applications...
Leveraging Good Representations in Linear Contextual Bandits. Matteo Papini†¹, Andrea Tirinzoni¹, Marcello Restelli¹, Alessandro Lazaric², Matteo Pirotta². Abstract: [...] range of domains, including recommendation systems, on- [...] The lin...
Improved Regret Bounds of Bilinear Bandits using Action Space Analysis. Kyoungseok Jang¹, Kwang-Sung Jun², Se-Young Yun³, Wanmo Kang¹. Abstract: [...] arrange couples based on their experiences to get better ratings and rewards. Balancing e...
High-Dimensional Experimental Design and Kernel Bandits. Romain Camilleri¹, Julian Katz-Samuels², Kevin Jamieson¹. Abstract: We consider the following game between a learner and nature: at each time t = 1, ..., T, the learner requests x_t...
Fairness of Exposure in Stochastic Bandits. Lequn Wang¹, Yiwei Bai¹, Wen Sun¹, Thorsten Joachims¹. Abstract: While maximizing user responses may arguably be in the interest of the platform and its users at least in the short term, [...] Contextua...
Dynamic Balancing for Model Selection in Bandits and RL. Ashok Cutkosky¹, Christoph Dann², Abhimanyu Das³, Claudio Gentile², Aldo Pacchiano⁴, Manish Purohit³. Abstract: [...]sumptions about the class of policies, the source generating reward...
Combinatorial Blocking Bandits with Stochastic Delays. Alexia Atsidakou¹, Orestis Papadigenopoulos², Soumya Basu³, Constantine Caramanis¹, Sanjay Shakkottai¹. Abstract: [...] Cella & Cesa-Bianchi, 2019). These variants capture applic...
Best Arm Identification in Graphical Bilinear Bandits. Geovani Rizk¹,², Albert Thomas², Igor Colin², Rida Laraki¹,³, Yann Chevaleyre¹. Abstract: [...] agent (e.g., all the configuration parameters of the antennas), and receives an associated g...
Bias-Robust Bayesian Optimization via Dueling Bandits. Johannes Kirschner¹, Andreas Krause¹. Abstract: We study a setting where the learner’s objective is to maximize an unknown function f: X → R with additive [...] We consider Bayesian o...
Beyond log²(T) Regret for Decentralized Bandits in Matching Markets. Soumya Basu¹, Karthik Abinav Sankararaman², Abishek Sankararaman³. Abstract: [...] Bandits is dedicated to understanding algorithmic principles in the interplay of com...
Approximation Theory Based Methods for RKHS Bandits. Sho Takemori¹, Masahiro Sato¹. Abstract: [...] the adversarial RKHS bandit problem, where a learner interacts with a sequence of any functions from the RKHS with [...] The RKHS bandit problem (a...
An Algorithm for Stochastic and Adversarial Bandits with Switching Costs. Chloé Rouyer¹, Yevgeny Seldin¹, Nicolò Cesa-Bianchi². Abstract: [...] an arm different from the one played in the previous round. Such switching cost may occur in th...