Value Iteration in Continuous Actions, States and Time. Michael Lutter (1, 2), Shie Mannor (1, 3), Jan Peters (2), Dieter Fox (1, 4), Animesh Garg (1, 5). Abstract: Value Iteration; Fitted Value Iteration; Continuous Fitted Value Iteration. Classical value ite...
PID Accelerated Value Iteration Algorithm. Amir-massoud Farahmand (1, 2), Mohammad Ghavamzadeh (3). Abstract: approximation of the value or action-value functions, i.e., $V_{k+1} \leftarrow T^{\pi} V_k$ or $Q_{k+1} \leftarrow T^{*} Q_k$. For discounted MDPs, The convergence...
Structured Policy Iteration for Linear Quadratic Regulator. Youngsuk Park (1), Ryan A. Rossi (2), Zheng Wen (3), Gang Wu (2), Handong Zhao (2). Abstract: son & Moore, 2007) spanning several decades. Linear quadratic regulator (LQR) is one of the This sto...
On the Iteration Complexity of Hypergradient Computation. Riccardo Grazzi (1, 2), Luca Franceschi (1, 2), Massimiliano Pontil (1, 2), Saverio Salzo (1). Abstract: et al., 2018), as well as recurrent and graph neural networks (Almeida, 1987; Pineda, 19...
Multi-Agent Routing Value Iteration Network. Quinlan Sykora, Mengye Ren, Raquel Urtasun. Abstract: Figure 1. A visualization of the route produced by a fleet of twenty vehicles using our proposed algorithm. Colors denote different In th...
Projections for Approximate Policy Iteration Algorithms. Riad Akrour (1), Joni Pajarinen (1, 2), Gerhard Neumann (3, 4), Jan Peters (1, 5). Abstract: dient, a key breakthrough was the use of natural gradient that follows the steepest descent in behavio...
POLITEX: Regret Bounds for Policy Iteration Using Expert Prediction. Yasin Abbasi-Yadkori (1), Peter L. Bartlett (2), Kush Bhatia (2), Nevena Lazić (3), Csaba Szepesvári (4), Gellért Weisz (4). Abstract: model-based algorithms, and theoretical ev...
Tensor Decomposition via Simultaneous Power Iteration. Po-An Wang (1), Chi-Jen Lu (1). Abstract: and in fact several problems related to tensor decomposition are known to be NP-hard (Hillar & Lim, 2013). Nev- Tensor decomposition is an impo...
On the Iteration Complexity of Support Recovery via Hard Thresholding Pursuit. Jie Shen (1), Ping Li (1). Abstract: 2010; Blumensath & Davies, 2009; Bouchot et al., 2016). Recovering the support of a sparse signal from Compared to parameter es...