Is Pessimism Provably Efficient for Offline RL? Ying Jin 1 Zhuoran Yang 2 Zhaoran Wang 3 Abstract Vinyals et al., 2017) relies on two ingredients: (i) expressive function approximators, e.g., deep neural networks (LeCun We study offl...
Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning Sebastian Curi 1 Ilija Bogunovic 1 Andreas Krause 1 Abstract unpredictable ways. The main goal is then to learn a policy that provab...