Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting. Zixin Zhong¹, Wang Chi Cheung²,³, Vincent Y. F. Tan¹,³,⁴. Abstract: [...] next one otherwise. This process stops when she clicks on one item in the list or if no item is cli...
Bandits with Adversarial Scaling. Thodoris Lykouris¹, Vahab Mirrokni², Renato Paes Leme². Abstract: Modeling this problem, one soon realizes that the two classical multi-armed bandit approaches fail to capture the [...] We study adversari...
Target Tracking for Contextual Bandits: Application to Demand Side Management. Margaux Brégère¹,²,³, Pierre Gaillard³, Yannig Goude¹,², Gilles Stoltz². Abstract: [...]ble by the development of energy storage devices such as batteries or ev...
Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards. Shiyin Lu¹, Guanghui Wang¹, Yao Hu², Lijun Zhang¹. Abstract: [...] from a fixed but unknown probability distribution associated with the chosen arm. In order to maximize his g...
Decentralized Exploration in Multi-Armed Bandits. Raphaël Féraud¹, Réda Alami¹, Romain Laroche². Abstract: [...]vice is connecting to the application, the application presents an option to the user of the device. The aim is to maximize [...] We...
Data Poisoning Attacks on Stochastic Bandits. Fang Liu¹, Ness Shroff¹,². Abstract: [...] is motivated by modern industrial scale applications of machine learning systems, where data collection and policy [...] Stochastic multi-armed Bandits fo...
Correlated Bandits or: How to Minimize Mean-Squared Error Online. Vinay Praneeth Boda¹, Prashanth L. A.². Abstract: [...] bandit problem, this objective involves an estimation of the correlation structure among the various arms. This is mo-...
Bilinear Bandits with Low-rank Structure. Kwang-Sung Jun¹, Rebecca Willett², Stephen Wright³, Robert Nowak³. Abstract: [...] system may want to choose a pair of items (top, bottom) for a customer, whose appeal depends in part on whether they [...] We i...
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback. Chicheng Zhang¹, Alekh Agarwal¹, Hal Daumé III¹,², John Langford¹, Sahand N Negahban³. Abstract: [...] ensuring that such a system does not need to suffer to...
Semiparametric Contextual Bandits. Akshay Krishnamurthy¹, Zhiwei Steven Wu¹, Vasilis Syrgkanis². Abstract: [...] that make general reinforcement learning challenging. Contextual bandit algorithms have seen recent success in appli- [...] Th...
Practical Contextual Bandits with Regression Oracles. Dylan J. Foster¹, Alekh Agarwal², Miroslav Dudík², Haipeng Luo³, Robert E. Schapire². Abstract: [...] agnostic in the sense that they are provably effective for any given policy class an...
Firing Bandits: Optimizing Crowdfunding. Lalit Jain¹, Kevin Jamieson¹. Abstract: [...] controls the firing. Recent years have seen a huge proliferation of crowdfunding sites, with over 700 platforms in 2012 [...] In this paper, we model the probl...
Causal Bandits with Propagating Inference. Akihiro Yabe¹, Daisuke Hatano², Hanna Sumita³, Shinji Ito¹, Naonori Kakimura⁴, Takuro Fukunaga², Ken-ichi Kawarabayashi⁵. Abstract: [...] exploring the optimal arm A∗ ∈ A. The efficiency of the stra...
Best Arm Identification in Linear Bandits with Linear Dimension Dependency. Chao Tao¹, Saúl A. Blanco¹, Yuan Zhou¹,²,³. Abstract: [...] 2017a; 2014; Kalyanakrishnan & Stone, 2010; Zhou et al., 2014)). [...] We study the best arm identification prob...
Bandits with Delayed, Aggregated Anonymous Feedback. Ciara Pike-Burke¹, Shipra Agrawal², Csaba Szepesvári³,⁴, Steffen Grünewälder¹. Abstract: [...] of the K possible arms. In the classic stochastic MAB setting, the player immediately o...
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits. Huasen Wu¹, Xueying Guo², Xin Liu². Abstract: [...] Motivating scenario 1: price variation. [...] MAB has been widely used in studying effective procedures and treatments [...] In...
On Context-Dependent Clustering of Bandits. Claudio Gentile¹, Shuai Li², Purushottam Kar³, Alexandros Karatzoglou⁴, Giovanni Zappella⁵, Evans Etrue¹. Abstract: [...] movie recommendation system, where the catalog is relatively static and...
On Kernelized Multi-armed Bandits. Sayak Ray Chowdhury¹, Aditya Gopalan¹. Abstract: [...]ance exploration and exploitation, as available knowledge must be transferred efficiently from a finite set of obser- [...] We consider the stochastic ban...
Multi-objective Bandits: Optimizing the Generalized Gini Index. Róbert Busa-Fekete¹, Balázs Szörényi²,³, Paul Weng⁴,⁵, Shie Mannor³. Abstract: [...] the agent has to tackle the classical exploration/exploitation dilemma: It has to...