Stable-Predictive Optimistic Counterfactual Regret Minimization. Gabriele Farina, Christian Kroer, Noam Brown, Tuomas Sandholm. Abstract: were used as an essential ingredient for all recent milestones in the benchmark domai...
Regret Circuits: Composability of Regret Minimizers. Gabriele Farina, Christian Kroer, Tuomas Sandholm. Abstract: variants, along with other scalability techniques such as real-time endgame solving (Ganzfried & Sandholm, 20...
POLITEX: Regret Bounds for Policy Iteration Using Expert Prediction. Yasin Abbasi-Yadkori, Peter L. Bartlett, Kush Bhatia, Nevena Lazić, Csaba Szepesvári, Gellért Weisz. Abstract: model-based algorithms, and theoretical ev...
Deep Counterfactual Regret Minimization. Noam Brown, Adam Lerer, Sam Gross, Tuomas Sandholm. Abstract: in two-player zero-sum games. Forms of tabular CFR have been used in all recent milestones in the benchmark domain Counterfact...
Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints. Nikolaos Liakopoulos, Apostolos Destounis, Georgios Paschos, Thrasyvoulos Spyropoulos, Panayotis Mertikopoulos. Abstract: a function of...
Adaptive Regret of Convex and Smooth Functions. Lijun Zhang, Tie-Yan Liu, Zhi-Hua Zhou. Abstract: real-world application, we are also facing another dynamic challenge: the optimal solution may change continuously. We investigate...
Tight Regret Bounds for Bayesian Optimization in One Dimension. Jonathan Scarlett. Abstract: 2010), who consider the cumulative regret: We consider the problem of Bayesian optimization (BO) in one dimension, under a Gaussian proc...
Regret Minimization for Partially Observable Deep Reinforcement Learning. Peter Jin, Kurt Keutzer, Sergey Levine. Abstract: function-based methods. Some policy gradient methods such as advantage actor-critic (Mnih et al., 2016)...
Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits. Zeyuan Allen-Zhu, Sébastien Bubeck, Yuanzhi Li. Abstract: The adversary selects a loss function ℓ_t : [K] → [0, 1]. Regret bounds in online learning co...
Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems. Marc Abeille, Alessandro Lazaric. Abstract: has been mostly addressed following two main approaches: optimism-in-face-of-uncertainty (OFU) and...
Dynamic Regret of Strongly Adaptive Methods. Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou. Abstract: incurred by the learner and that of the best fixed decision in hindsight, i.e., To cope with changing environments, recent deve...
Regret Minimization in Behaviorally-Constrained Zero-Sum Games. Gabriele Farina, Christian Kroer, Tuomas Sandholm. Abstract: set, and instantiating a standard regret minimizer at each information set in order to minimize local r...
Near-Optimal Design of Experiments via Regret Minimization. Zeyuan Allen-Zhu, Yuanzhi Li, Aarti Singh, Yining Wang. Abstract: one wishes to select kn experimental settings from X that are the most statistically efficient for establ...
Minimax Regret Bounds for Reinforcement Learning. Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos. Abstract: The most common approach to this learning problem is to separate the process of estimation and optimization. We consider...
Efficient Regret Minimization in Non-Convex Games. Elad Hazan, Karan Singh, Cyril Zhang. Abstract: In this paper we investigate the generalization of the non-convex statistical, or batch, learning model to online learn- We consider...