Learning the Reward Function for a Misspecified Model

Erik Talvitie 1

Figure 1. The Shooter domain.

in MBRL: learning a reward function. It is common for

Abstract

In model-based reinforcement learning it is typical to decouple the pr...