Documents related to "Q-learning"

Documents tagged "Q-learning" (10 results)

UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko. Abstract: balance the exploration of the environment and exploitation of the current knowl...
2023-11-16 19:42:15
Ensemble Bootstrapping for Q-learning
Oren Peer, Chen Tessler, Nadav Merlis, Ron Meir. Abstract: focuses on learning the value-function. The value represents the expected, discounted, reward-to-go that the agent will [...] Q-learning...
2023-11-16 18:38:01
EMaQ: Expected-Max Q-learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu. Abstract. 1. Introduction: Off-policy reinforcement learning (RL) holds t...
2023-11-16 18:38:00
Multi-Agent Determinantal Q-learning
Yaodong Yang, Ying Wen, Liheng Chen, Jun Wang, Kun Shao, David Mguni, Weinan Zhang. Abstract: A full spectrum of MARL algorithms has been developed to solve cooperative tasks (Panait & Luke, 20...
2023-11-14 21:45:15
Lookahead-Bounded Q-learning
Ibrahim El Shar, Daniel R. Jiang. Abstract: in the following sense: writing the transition dynamics as s_{t+1} = f(s_t, a_t, w_{t+1}), where s_t and a_t are the current [...] We introduce the lookahead-bounded Q-learnin...
2023-11-14 21:45:05
ConQUR: Mitigating Delusional Bias in Deep Q-learning
DiJia (Andy) Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier. Abstract: & Smart, 2004; Melo & Ribeiro, 2007; Maei et al., 2010; Munos et al., 2016); but it remains dif...
2023-11-14 21:43:34
A Finite-Time Analysis of Q-learning with Neural Network Function Approximation
Pan Xu, Quanquan Gu. Abstract: which triggers a line of research on deep reinforcement learning such as Double Deep Q-learning (Van Hasselt [...] Q-learning...
2023-11-14 17:19:24
Sample-Optimal Parametric Q-learning Using Linearly Additive Features
Lin F. Yang, Mengdi Wang. Abstract: this theoretical-sharp result does not generalize to practical problems where S, A can be arbitrarily large or infinite. Co...
2023-11-13 14:48:28
Making Deep Q-learning methods robust to time discretization
Corentin Tallec, Léonard Blier, Yann Ollivier. Abstract: prevents transfer from imperfect simulators to real world scenarios. Despite remarkable successes, Deep R...
2023-11-13 14:47:48
Diagnosing Bottlenecks in Deep Q-learning Algorithms
Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine. Abstract: which potential issues with Q-learning manifest in practice. We empirically analyze aspects of the Q-learning me...
2023-11-13 14:46:52
