Unsupervised Skill Discovery with Bottleneck Option Learning. Jaekyeom Kim¹, Seohong Park¹, Gunhee Kim¹. Abstract: learned skills can encourage the exploration for encountering rewards, not only by providing useful primitives for t...
Data-efficient Hindsight Off-policy Option Learning. Markus Wulfmeier¹, Dushyant Rao¹, Roland Hafner¹, Thomas Lampe¹, Abbas Abdolmaleki¹, Tim Hertweck¹, Michael Neunert¹, Dhruva Tirumala¹, Noah Siegel¹, Nicolas Heess¹, Martin Riedmill...
Option Discovery in the Absence of Rewards with Manifold Analysis. Amitay Bar¹, Ronen Talmon¹, Ron Meir¹. Abstract: the graph edges represent the states connectivity. Such an approach led to the introduction of proto-value functions Opt...
Per-Decision Option Discounting. Anna Harutyunyan¹ ², Peter Vrancx³ ², Philippe Hamel¹, Ann Nowé², Doina Precup¹. Abstract: The discount factor γ itself is usually treated as something in between a mathematical convenience and a meani...
A Laplacian Framework for Option Discovery in Reinforcement Learning. Marlos C. Machado¹, Marc G. Bellemare², Michael Bowling¹. Abstract: the optimal policy for that reward function. In this paper we introduce an algorithm for Option di...