Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping. Dongruo Zhou 1, Jiafan He 1, Quanquan Gu 1. Abstract: linear functions or neural networks to map states and actions to a low-dimensional space and solve t...
Optimizing for the Future in Non-Stationary MDPs. Yash Chandak 1, Georgios Theocharous 2, Shiv Shankar 1, Martha White 3, Sridhar Mahadevan 1 2, Philip S. Thomas 1. Abstract: creased friction and thus, change in the system dynamics. Similarly...
Invariant Causal Prediction for Block MDPs. Amy Zhang 1 2 3, Clare Lyle 4, Shagun Sodhani 3, Angelos Filos 4, Marta Kwiatkowska 4, Joelle Pineau 1 2 3, Yarin Gal 4, Doina Precup 1 2 5. Abstract: ditions in a room may change, but the physical dynamics of t...
Symbolic Network: Generalized Neural Policies for Relational MDPs. Sankalp Garg 1, Aniket Bajpai 1, Mausam 1. Abstract. 1. Introduction. A Relational Markov Decision Process (RMDP) A Relational Markov Decision Process (RMDP) (Boutilie...
Efficiently Solving MDPs with Stochastic Mirror Descent. Yujia Jin 1, Aaron Sidford 1. Abstract: an MDP given only restricted access to the model. In particular, we consider the problem of computing an ε-optimal In this paper we present a u...