State Relevance for Off-Policy Evaluation. Simon P. Shen 1, Yecheng Jason Ma 2, Omer Gottesman 3, Finale Doshi-Velez 1. Abstract important as many domains have trajectories with different lengths: in health settings, patients’ length o...
State Entropy Maximization with Random Encoders for Efficient Exploration. Younggyo Seo 1, Lili Chen 2, Jinwoo Shin 1, Honglak Lee 3 4, Pieter Abbeel 2, Kimin Lee 2. Abstract approaches encourage agents to visit diverse states, but leave unansw...
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity. Dhruv Malik 1, Aldo Pacchiano 2, Vishwak Srinivasan 1, Yuanzhi Li 1. Abstract such a benchmark (Bellemare et al., 2013). Agents trained...
Neural Pharmacodynamic State Space Modeling. Zeshan Hussain 1, Rahul G. Krishnan 2, David Sontag 1. Abstract sub-type (Zhang et al., 2019b). To do these tasks well, understanding how a patient’s biomarkers evolve over time given Model...
State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes. William J. Wilkinson 1, Paul E. Chang 1, Michael Riis Andersen 2, Arno Solin 1. Abstract Filtering/Forward pass → ← Smoothing/Backwar...
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning. Dipendra Misra 1, Mikael Henaff 1, Akshay Krishnamurthy 1, John Langford 1. Abstract from the well-studied tabular setting to explore the en...
State Abstractions for Lifelong Reinforcement Learning. David Abel 1, Dilip Arumugam 1, Lucas Lehnert 1, Michael L. Littman 1. Abstract M...
State Space Gaussian Processes with Non-Gaussian Likelihood. Hannes Nickisch 1, Arno Solin 2, Alexander Grigorievskiy 2 3. Abstract addressed by approximate covariance computations using sparse inducing point methods (Quiñonero...
Scalable Bilinear π Learning Using State and Action Features. Yichen Chen 1, Lihong Li 2, Mengdi Wang 3. Abstract e.g., Azar et al. (2013)). In other words, there is an oracle that takes (s, a) as input and outputs a random s with prob- Approxi...
Recurrent Predictive State Policy Networks. Ahmed Hefny 1, Zita Marinho 2 3, Wen Sun 2, Siddhartha S. Srinivasa 4, Geoffrey Gordon 1. Abstract 1. Introduction We introduce Recurrent Predictive State Policy Recently, there has been signifi...
Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning. Thomas Dietterich 1, George Trimponias 2, Zhitang Chen 2. Abstract channel. This high degree of stochasticity can confuse reinforcement lea...