Safe Reinforcement Learning with Linear Function Approximation. Sanae Amani¹, Christos Thrampoulidis², Lin F. Yang¹. Abstract: action may lead to catastrophic results. Thus, safety in RL has become a serious issue that restricts the ap...
Safe Reinforcement Learning Using Advantage-Based Intervention. Nolan Wagener¹, Byron Boots², Ching-An Cheng³. Abstract: Figure 1. Advantage-based intervention of SAILR and construction of the surrogate MDP M. In M, whenever the po...
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee. Tengyu Xu¹, Yingbin Liang¹, Guanghui Lan². Abstract: Mind, 2019) and recommendation system (Zheng et al., 2018), etc. In these settings, the agent is allowe...
Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies. Tsung-Yen Yang¹, Justinian Rosca², Karthik Narasimhan¹, Peter J. Ramadge¹. Abstract: or other costs. For instance, when you drive an unfamiliar...
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences. Daniel S. Brown¹, Russell Coleman¹,², Ravi Srinivasan², Scott Niekum¹. Abstract: demonstrations, it is important for an agent to be able to provide high-confid...
Safe Screening Rules for ℓ0-Regression from Perspective Relaxations. Alper Atamtürk¹, Andrés Gómez². Abstract: 2015), and the ℓ2 (ridge) regularization (Hoerl & Kennard, 1970) imposes bias/shrinkage in the regression coeffici...
Safe Reinforcement Learning in Constrained Markov Decision Processes. Akifumi Wachi¹, Yanan Sui². Abstract: essential requirement, the primary objective is nonetheless to obtain rewards (e.g., scientific gain). Safe reinforcemen...
Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data. Lan-Zhe Guo¹, Zhen-Yu Zhang¹, Yuan Jiang¹, Yu-Feng Li¹, Zhi-Hua Zhou¹. Abstract: Figure 1. One example of class distribution mismatch. Unlabeled data contains classes...
Fast OSCAR and OWL Regression via Safe Screening Rules. Runxue Bao¹, Bin Gu², Heng Huang¹,². Abstract: without any prior information of feature groups. Remarkably, (Bu et al., 2019) concluded that it has two good proper- Ordered Weighted L...
Safe Grid Search with Optimal Complexity. Eugene Ndiaye¹, Tam Le¹, Olivier Fercoq², Joseph Salmon³, Ichiro Takeuchi⁴. Abstract: the first part (training set) the method is trained for a pre-defined collection of candidates ΛT := {λ0,......
Safe Policy Improvement with Baseline Bootstrapping. Romain Laroche¹, Paul Trichelair¹, Remi Tachet des Combes¹. Abstract: is a key challenge of modern RL that needs to be tackled before any wide-scale adoption. This paper considers Saf...
Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces. Johannes Kirschner¹, Mojmír Mutný¹, Nicole Hiller², Rasmus Ischebeck², Andreas Krause¹. Abstract: Bayesian optimization is...
Stagewise Safe Bayesian Optimization with Gaussian Processes. Yanan Sui¹, Vincent Zhuang¹, Joel W. Burdick¹, Yisong Yue¹. Abstract: Many of these applications are also subject to a variety of safety constraints, so that actions cannot be...
Safe Element Screening for Submodular Function Minimization. Weizhong Zhang¹, Bin Hong², Lin Ma¹, Wei Liu¹, Tong Zhang¹. Abstract: with convex functions. They arise naturally in many domains, such as clustering (Narasimhan & Bilmes, 200...