Principled Exploration via Optimistic Bootstrapping and Backward Induction. Chenjia Bai1 Lingxiao Wang2 Lei Han3 Jianye Hao4 Animesh Garg5 Peng Liu1 Zhaoran Wang2. Abstract: 2007; Jin et al., 2018) is a principled approach for effici...
Constrained Markov Decision Processes via Backward Value Functions. Harsh Satija123 Philip Amortila12 Joelle Pineau123. Abstract: algorithms has been limited to simulators, where the learning algorithm has the ability to reset th...