Learning and Planning in Complex Action Spaces. Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver. Abstract. real-world problems. Many important real-world problems...
Compositional Video Synthesis with Action Graphs. Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Trevor Darrell, Amir Globerson. Abstract. Figure 1. We propose a new task called Action Graph to Video. To represen...
Improved Sleeping Bandits with Stochastic Actions Sets and Adversarial Rewards. Aadirupa Saha, Pierre Gaillard, Michal Valko. Abstract. et al., 2012). However in various real world applications, the decision space (set of arms A) of...
Growing Action Spaces. Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve. Abstract. trol over the environment. In this work, we instead make use of curricula that are internal to the a...
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning. Alberto Maria Metelli, Flavio Mazzolini, Lorenzo Bisi, Luca Sabbioni, Marcello Restelli. Abstract. continuous–time MDPs (Bradtke & Duff,...
Learning Action Representations for Reinforcement Learning. Yash Chandak, Georgios Theocharous, James E. Kostas, Scott M. Jordan, Philip S. Thomas. Abstract. Figure 1. The structure of the proposed overall policy, πo, consist-in...
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning. Jakob N. Foerster, H. Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew M. Botvinick, Michael Bowling. Abstract. 1. Introduction...
Action Robust Reinforcement Learning and Applications in Continuous Control. Chen Tessler, Yonathan Efroni, Shie Mannor. Abstract. The advantage of robust policies is highlighted when considering imperfect models, a common scen...
Smoothed Action Value Functions for Learning Gaussian Policies. Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans. Abstract. hard-max notion of Q-value, defined as the expected return of following an optimal policy. S...
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control. Yangchen Pan, Amir-massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski. Abstract. namic system (Li...
Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling. Kyowoon Lee, Sol-A Kim, Jaesik Choi, Seong-Whan Lee. Abstract. 1992), and othello (Buro, 1999). Recently, deep convolutional ne...