Reinforcement Learning for Cost-Aware Markov Decision Processes. Wesley A. Suttle, Kaiqing Zhang, Zhuoran Yang, David N. Kraemer, Ji Liu. Abstract: quently used in practice. Nevertheless, alternative objectives have seen increas...
Online Learning in Unknown Markov Games. Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra. Abstract: control both/all players and aim to minimize the number of episodes required to find a good policy; and (2) the online We study online lea...
Learning and Planning in Average-Reward Markov Decision Processes. Yi Wan, Abhishek Naik, Richard S. Sutton. Abstract: with it. For learning and combined methods, both control and prediction problems can be further subdivided into...
Estimating Identifiable Causal Effects on Markov Equivalence Class through Double Machine Learning. Yonghan Jung, Jin Tian, Elias Bareinboim. Abstract: tigates whether, given a causal graph G encoding qualitative knowledge about...
Safe Reinforcement Learning in Constrained Markov Decision Processes. Akifumi Wachi, Yanan Sui. Abstract: essential requirement, the primary objective is nonetheless to obtain rewards (e.g., scientific gain). Safe reinforcemen...
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism. Wang Chi Cheung, David Simchi-Levi, Ruihao Zhu. Abstract: imizes its cumulative rewards, while facing the following challenge...
Privately Learning Markov Random Fields. Huanyu Zhang, Gautam Kamath, Janardhan Kulkarni, Zhiwei Steven Wu. Abstract: an exponential sample complexity in p), Markov Random Fields (MRFs) are a particular family of undirected graphi...
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes. Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, Rahul Jain. Abstract: and model-free. Model-based algorithms...
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments. Ali Ahmadi Teshnizi, Saber Salehkaleybar, Negar Kiyavash. Abstract: variable X is a direct cause of variable Y. Under the faithfulness assum...
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu. Abstract: The majority of the literature in learning MDPs assumes stationar...
Fast and Consistent Learning of Hidden Markov Models by Incorporating Non-Consecutive Correlations. Robert Mattila, Cristian R. Rojas, Eric Moulines, Vikram Krishnamurthy, Bo Wahlberg. Abstract: computational biology (Durbin...
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making. Chengchun Shi, Runzhe Wan, Rui Song, Wenbin Lu, Ling Leng. Abstract: 1.1. Contributions and advances of our test The Markov as...
Constrained Markov Decision Processes via Backward Value Functions. Harsh Satija, Philip Amortila, Joelle Pineau. Abstract: algorithms has been limited to simulators, where the learning algorithm has the ability to reset th...
Consistent Structured Prediction with Max-Min Margin Markov Networks. Alex Nowak-Vila, Francis Bach, Alessandro Rudi. Abstract: diction mistakes are not equally costly. In sequence prediction, for instance, the number of possib...
Online Convex Optimization in Adversarial Markov Decision Processes. Aviv Rosenberg, Yishay Mansour. Abstract: We propose a novel algorithm for the adversarial MDP model where the transition function is unknown to the We consider o...
Moment-Based Variational Inference for Markov Jump Processes. Christian Wildner, Heinz Koeppl. Abstract: also apply a continuous version of the classical forward-backward algorithm for hidden Markov models. In the con- We propose...
Learning to Collaborate in Markov Decision Processes. Goran Radanovic, Rati Devidze, David C. Parkes, Adish Singla. Abstract: We expect that useful collaboration will come about through AI agents that can adapt to the behavior of user...
Kernel-Based Reinforcement Learning in Robust Markov Decision Processes. Shiau Hong Lim, Arnaud Autef. Abstract: This class includes kernel averaging, k-nearest-neighbor, weighted k-nearest neighbor, Bezier patches, linear in...