Pure Exploration and Regret Minimization in Matching Bandits. Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnović. Abstract: online advertising, where the probability that a user clicks on an ad dep...
Optimal Regret algorithm for Pseudo-1d Bandit Convex Optimization. Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain. Abstract: the problem has a "pseudo-1d" structure in the loss functions f_t(w) = ℓ_t(g_t(w;...
Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach. Nadav Hallak, Panayotis Mertikopoulos, Volkan Cevher. Abstract: problems, and they can adapt to different measures of Regret under differen...
Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints. Xinlei Yi, Xiuxian Li, Tao Yang, Lihua Xie, Tianyou Chai, Karl H. Johansson. Abstract: cations in online binary classifi...
Non-Exponentially Weighted Aggregation: Regret Bounds for Unbounded Loss Functions. Pierre Alquier. Abstract: the sub-gradient of ℓ_t can be used. Such strategies lead to Regret in √T under the additional assumption that the ℓ_t are We t...
Logarithmic Regret for Reinforcement Learning with Linear Function Approximation. Jiafan He, Dongruo Zhou, Quanquan Gu. Abstract: A common approach to cope with high-dimensional state and action spaces is to utilize function appro...
Lenient Regret and Good-Action Identification in Gaussian Process Bandits. Xu Cai, Selwyn Gomes, Jonathan Scarlett. Abstract: gorithms can often be applied in a unified manner in these two settings. In this paper, we study the proble...
Improved Regret Bound and Experience Replay in Regularized Policy Iteration. Nevena Lazić, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári. Abstract: proposed by Even-Dar et al. (2009), where the agent selects policies by r...
Improved Regret Bounds of Bilinear Bandits using Action Space Analysis. Kyoungseok Jang, Kwang-Sung Jun, Se-Young Yun, Wanmo Kang. Abstract: arrange couples based on their experiences to get better ratings and rewards. Balancing e...
Collaborative Bayesian Optimization with Fair Regret. Rachael Hwee Ling Sim, Yehong Zhang, Bryan Kian Hsiang Low, Patrick Jaillet. Abstract: performance by sequentially selecting input queries for evaluating the objective funct...
Bayesian Optimistic Optimisation with Exponentially Decaying Regret. Hung Tran-The, Sunil Gupta, Santu Rana, Svetha Venkatesh. Abstract: transform a global optimisation problem into a sequence of auxiliary optimisation problem...
Beyond log^2(T) Regret for Decentralized Bandits in Matching Markets. Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman. Abstract: bandits is dedicated to understanding algorithmic principles in the interplay of com...
A Regret Minimization Approach to Iterative Learning Control. Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh. Abstract: of factors. The primary challenge we focus on in this paper is the existence of unmodeled deviatio...
Stochastic Regret Minimization in Extensive-Form Games. Gabriele Farina, Christian Kroer, Tuomas Sandholm. Abstract: Typically, EFG models are operationalized by computing either a Nash equilibrium of the game, or an approxim...
Near-optimal Regret Bounds for Stochastic Shortest Path. Alon Cohen, Haim Kaplan, Yishay Mansour, Aviv Rosenberg. Abstract: The focus of this work is on Regret minimization in SSP. It builds on extensive literature on theoretical a...
Logarithmic Regret for Adversarial Online Control. Dylan J. Foster, Max Simchowitz. Abstract: by a well-behaved stochastic process or driven by a worst-case process to which the learner must remain robust in We introduce a new algorit...
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently. Asaf Cassel, Alon Cohen, Tomer Koren. Abstract: We consider the problem of learning in Lin- ... O(√T) Regret bound for this setting albeit with a computationall...
Improved Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance. Blair Bilodeau, Dylan J. Foster, Daniel M. Roy. Abstract: The log loss penalizes the player based on how much probability mass they place on the actua...
A new Regret analysis for Adam-type algorithms. Ahmet Alacaoglu, Yura Malitsky, Panayotis Mertikopoulos, Volkan Cevher. Abstract: One can wonder whether there is an inherent obstacle – in the proposed methods or the setting – whic...
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds. Andrea Zanette, Emma Brunskill. Abstract: Fortunately in practice reinforcement learning algorithms oft...