On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference

Rohin Shah¹, Noah Gundotra¹, Pieter Abbeel¹, Anca D. Dragan¹

[Figure: p(a | s) ∝ exp(β Q(s, a; r))]

Abstract

Our goal is for agents to optimize the right reward fu...