Policy Gradient Bayesian Robust Optimization for Imitation Learning. Zaynah Javed¹, Daniel S. Brown¹, Satvik Sharma¹, Jerry Zhu¹, Ashwin Balakrishna¹, Marek Petrik², Anca D. Dragan¹, Ken Goldberg¹. Abstract: human-designed reward functi...
Imitation by Predicting Observations. Andrew Jaegle¹, Yury Sulsky¹, Arun Ahuja¹, Jake Bruce¹, Rob Fergus¹, Greg Wayne¹. Abstract: 2009; Huber et al., 2009). While most algorithms for imitation learning assume that demonstrations contai...
Hyperparameter Selection for Imitation Learning. Léonard Hussenot¹,², Marcin Andrychowicz¹, Damien Vincent¹, Robert Dadashi¹, Anton Raichuk¹, Lukasz Stafiniak¹, Sertan Girgin¹, Raphael Marinier¹, Nikola Momchev¹, Sabela Ramos¹, Manu...
Keyframe-Focused Visual Imitation Learning. Chuan Wen¹, Jierui Lin², Jianing Qian³, Yang Gao¹,⁴, Dinesh Jayaraman³. Abstract: the demonstration data. While BC has well-documented distributional shift issues due to compounding imitati...
Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation. Théo Cachet¹, Julien Perez¹, Christopher R. Dance¹. Abstract: Figure 1. The proposed DCRL algorithm, which uses both expert demonstrations and environment...
Cross-domain Imitation from Observations. Dripta S. Raychaudhuri¹, Sujoy Paul²†, Jeroen van Baar³, Amit K. Roy-Chowdhury¹. (Figure labels: Expert domain, Proxy task, Inference task.) Abstract: Imitation learning seeks to circumvent the diffi- Transfor...
Adversarial Option-Aware Hierarchical Imitation Learning. Mingxuan Jing¹, Wenbing Huang¹, Fuchun Sun†¹,², Xiaojian Ma³, Tao Kong⁴, Chuang Gan⁵, Lei Li⁴. Abstract: lated by an Option model (Sutton et al., 1999) or goal-based framework (Lee...
Variational Imitation Learning with Diverse-quality Demonstrations. Voot Tangkaratt¹, Bo Han²,¹, Mohammad Emtiyaz Khan¹, Masashi Sugiyama¹,³. Abstract: an assumption that diversity is caused by noise-densities. Learning from demons...
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences. Daniel S. Brown¹, Russell Coleman¹,², Ravi Srinivasan², Scott Niekum¹. Abstract: demonstrations, it is important for an agent to be able to provide high-confid...
Provable Representation Learning for Imitation Learning via Bi-level Optimization. Sanjeev Arora¹,², Simon S. Du², Sham Kakade³, Yuping Luo¹, Nikunj Saunshi¹. Abstract: Markov decision processes (MDPs) that share the same state and acti...
Intrinsic Reward Driven Imitation Learning via Generative Model. 2020.02.05. Xingrui Yu¹, Yueming Lyu¹, Ivor W. Tsang¹. (Figure labels: Beyond Expert, Expert Level.) Abstract: Imitation learning in a high-dimensional environment is challenging. Most i...
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate. Yufeng Zhang¹, Qi Cai¹, Zhuoran Yang², Zhaoran Wang¹. Abstract: optimal policy. IRL formulates IL as a bilevel op...
An Imitation Learning Approach for Cache Replacement. Evan Zheran Liu¹,², Milad Hashemi², Kevin Swersky², Parthasarathy Ranganathan², Junwhan Ahn². (Figure labels: Cache, Evict, Accesses.) Abstract: Program execution speed critically depends o...
Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation. Ruohan Wang¹, Carlo Ciliberto¹, Pierluigi V. Amadori¹, Yiannis Demiris¹. Abstract: 2016). Despite its simplicity, BC typically requires a large amo...
Provably Efficient Imitation Learning from Observation Alone. Wen Sun¹, Anirudh Vemula¹, Byron Boots², J. Andrew Bagnell³. Abstract: action, via supervised learning approaches (e.g., DAgger (Ross et al., 2011), AggreVaTe (Ross & Bagne...
Imitation Learning from Imperfect Demonstration. Yueh-Hua Wu¹,², Nontawat Charoenphakdee³,², Han Bao³,², Voot Tangkaratt², Masashi Sugiyama²,³. Abstract: maximum entropy (Ziebart et al., 2008). Imitation learning (IL) aims to learn an op...
CompILE: Compositional Imitation Learning and Execution. Thomas Kipf¹†, Yujia Li², Hanjun Dai³†, Vinicius Zambaldi², Alvaro Sanchez-Gonzalez², Edward Grefenstette⁴#, Pushmeet Kohli², Peter Battaglia². Abstract: Latent code (per seg...
Hierarchical Imitation and Reinforcement Learning. Hoang M. Le¹, Nan Jiang², Alekh Agarwal², Miroslav Dudík², Yisong Yue¹, Hal Daumé III³,². Abstract: ficiency in RL over long time horizons is to exploit hierarchical structure of the...
End-to-End Differentiable Adversarial Imitation Learning. Nir Baram¹, Oron Anschel¹, Itai Caspi¹, Shie Mannor¹. Abstract: 1991). By providing constant supervision (dense reward signal in Reinforcement Learning (RL) terminology), B...