The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation. Zhe Feng¹, David C. Parkes¹, Haifeng Xu². Abstract. able to modulate its own reward feedback in order to further its own objective, e.g., increasing the number of t...
Intrinsic Reward Driven Imitation Learning via Generative Model. 2020.02.05. Xingrui Yu¹, Yueming Lyu¹, Ivor W. Tsang¹. Abstract. Imitation learning in a high-dimensional environment is challenging. Most i...
Circuit-Based Intrinsic Methods to Detect Overfitting. Satrajit Chatterjee¹, Alan Mishchenko². Abstract. knowledge, such as, the performance of the model on examples held out from the training process, details of the process The foc...
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning. Natasha Jaques¹,², Angeliki Lazaridou², Edward Hughes², Caglar Gulcehre², Pedro A. Ortega², DJ Strouse³, Joel Z. Leibo², Nando de Freitas². Abstract. ac...