Reinforcement Learning with Deep Energy-Based Policies. Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine. Abstract: stochastic policies are desirable for exploration, this exploration is typically attained heuris...
Neural Optimizer Search with Reinforcement Learning. Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le. Abstract: [Figure 1. An overview of Neural Optimizer Search.] We present an approach to automate the process rent network contr...
Modular Multitask Reinforcement Learning with Policy Sketches. Jacob Andreas, Dan Klein, Sergey Levine. Abstract: [Figure labels: τ1: make planks Π1 τ2: make sticks Π2 b1: get wood K1 π1] We describe a framework for multitask deep re- [b2: use workb...]
Minimax Regret Bounds for Reinforcement Learning. Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos. Abstract: The most common approach to this learning problem is to separate the process of estimation and optimization. We consider...
FeUdal Networks for Hierarchical Reinforcement Learning. Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu. Abstract: challenging, since the agent has to learn...
Fairness in Reinforcement Learning. Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth. Abstract: tings where historical context can have a distinct influence on the future. For concreteness, we consider t...
Device Placement Optimization with Reinforcement Learning. Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, Jeff Dean. Abstract: et al., 2015; Wu...
Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability. Shayegan Omidshafiei, Jason Pazis, Christopher Amato, Jonathan P. How, John Vian. Abstract: partial observability and limited comm...
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning. Irina Higgins, Arka Pal, Andrei Rusu, Loic Matthey, Christopher Burgess, Alexander Pritzel, Matthew Botvinick, Charles Blundell, Alexander Lerchner. Abstract: ef...
Counterfactual Data-Fusion for Online Reinforcement Learners. Andrew Forney, Judea Pearl, Elias Bareinboim. Abstract: In this work, we study the conditions under which data collected under heterogeneous conditions (to be define...
An Alternative Softmax Operator for Reinforcement Learning. Kavosh Asadi, Michael L. Littman. Abstract: An ideal softmax operator is a parameterized set of operators that: A softmax operator applied to a set of values acts somewhat li...