Discovering Options for Exploration by Minimizing Cover Time. Yuu Jinnai, Jee Won Park, David Abel, George Konidaris. Abstract: options guaranteed to reduce the expected cover time using the transition function either given to or lea...
Decentralized Exploration in Multi-Armed Bandits. Raphaël Féraud, Réda Alami, Romain Laroche. Abstract: vice is connecting to the application, the application presents an option to the user of the device. The aim is to maximize We...
Dead-ends and Secure Exploration in Reinforcement Learning. Mehdi Fatemi, Shikhar Sharma, Harm van Seijen, Samira Ebrahimi Kahou. Abstract: has to interact with the environment and learn from its experience. There always exists a (...
Curiosity-Bottleneck: Exploration by Distilling Task-Specific Novelty. Youngjin Kim, Wontae Nam, Hyunwoo Kim, Ji-Hoon Kim, Gunhee Kim. Abstract: contains novel but task-irrelevant information. For example, suppose a ro...
The Uncertainty Bellman Equation and Exploration. Brendan O’Donoghue, Ian Osband, Remi Munos, Volodymyr Mnih. Abstract: tions that maximize rewards given its current knowledge? We consider the Exploration/exploitation prob- Se...
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms. Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer. Abstract: Deep RL algorithms generally consist in applying Stochastic Gradient...
Coordinated Exploration in Concurrent Reinforcement Learning. Maria Dimakopoulou, Benjamin Van Roy. Abstract: and refines estimates as data is gathered. At the start of each episode, the agent samples an MDP from its current poste- W...
Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks. David Balduzzi, Brian McWilliams, Tony Butler-Yeoman. Abstract: [Fig. 1: Shattered gradients in a PL-function.] Modern convolutional networks, inc...
Curiosity-driven Exploration by Self-supervised Prediction. Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell. Abstract: [Figure panels: (a) learn to explore in Level-1; (b) explore faster in Level-2.] In many real-world scenarios, rew...
Count-Based Exploration with Neural Density Models. Georg Ostrovski, Marc G. Bellemare, Aäron van den Oord, Rémi Munos. Abstract: been successfully demonstrated in a number of settings (Deisenroth & Rasmussen, 2011; Guez et al....