Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion. Cong Ma 1, Kaizheng Wang 1, Yuejie Chi 2, Yuxin Chen 3. Abstract 1. Introduction Recent years h...
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks. Zhao Chen 1, Vijay Badrinarayanan 1, Chen-Yu Lee 1, Andrew Rabinovich 1. Abstract such as smartphones, wearable devices, and robots/drones. Such...
Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks. Mingyi Hong 1, Jason D. Lee 2, Meisam Razaviyayn 3. Abstract the following problem In this work, we st...
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers. Yao Ma 1, Alex Olshevsky 2, Venkatesh Saligrama 2, Csaba Szepesvari 3. Abstract matrix collects the random labels pr...
Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks. Peter L. Bartlett 1, David P. Helmbold 2, Philip M. Long 3. Abstract 1. Introduction We analyze algo...
Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima. Simon S. Du 1, Jason D. Lee 2, Yuandong Tian 3, Barnabás Póczos 1, Aarti Singh 1. Abstract fully. Why such simple methods in learning DCNN is succes...
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator. Maryam Fazel 1, Rong Ge 2, Sham M. Kakade 1, Mehran Mesbahi 1. Abstract 2016) and Atari game playing (Mnih et al., 2015). Deep reinforcement learning (DeepRL...
Gradient Coding from Cyclic MDS Codes and Expander Graphs. Netanel Raviv 1, Itzhak Tamo 2, Rashish Tandon 3, Alexandros G. Dimakis 4. Abstract distributed synchronous Gradient descent. Gradient coding is a technique for straggler mit- Th...
Functional Gradient Boosting based on Residual Network Perception. Atsushi Nitanda 1 2, Taiji Suzuki 1 2. Abstract & Wolf (2016). They presented that ResNets are ensemble of shallower models using an unraveled view of ResNets. Residual...
Finding Influential Training Samples for Gradient Boosted Decision Trees. Boris Sharchilev 1 2, Yury Ustinovsky 3, Pavel Serdyukov 2, Maarten de Rijke 1. Abstract (2) boosting developer’s trust in the model’s performance in scenarios...
Data-Dependent Stability of Stochastic Gradient Descent. Ilja Kuzborskij 1, Christoph H. Lampert 2. Abstract tice one might not even reach a minimum, yet nevertheless observes excellent performance. We establish a data-dependent no...
Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn’s Algorithm. Pavel Dvurechensky 1, Alexander Gasnikov 2 3 4, Alexey Kroshnin 2 3 4. Abstract clustering (Ho et al., 2017), text cla...
Composite Functional Gradient Learning of Generative Adversarial Models. Rie Johnson 1, Tong Zhang 2. Abstract inate real data from generated data. Mathematically, GAN This paper first presents a theory for generative solves the foll...
Communication-Computation Efficient Gradient Coding. Min Ye 1, Emmanuel Abbe 2. Abstract nication cost (Gupta et al., 2015; Alistarh et al., 2017; Wen et al., 2017). This paper develops coding techniques to reduce the running time of di...
Asynchronous Decentralized Parallel Stochastic Gradient Descent. Xiangru Lian 1, Wei Zhang 2, Ce Zhang 3, Ji Liu 4 1. Abstract Figure 1. Centralized network and decentralized network. Most commonly used distributed machine learning S-P...
An Inference-Based Policy Gradient Method for Learning Options. Matthew J. A. Smith 1, Herke Van Hoof 2, Joelle Pineau 1. Abstract at various levels of abstraction, it is possible to infer, learn and plan much more efficiently. Further, ab...
Accelerating Natural Gradient with Higher-Order Invariance. Yang Song 1, Jiaming Song 1, Stefano Ermon 1. Abstract et al., 2013), and first-order methods will have trouble in making progress. The curvature, however, depends on how An ap...
Stochastic Gradient Monomial Gamma Sampler. Yizhe Zhang 1, Changyou Chen 1, Zhe Gan 1, Ricardo Henao 1, Lawrence Carin 1. Abstract ics, leading to Hamiltonian Monte Carlo (HMC) (Duane et al., 1987). Aided by Gradient information, HMC is able...
Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms. Qianxiao Li 1, Cheng Tai 2 3, Weinan E 2 3 4. Abstract where k ≥ 0 and {γ_k} are i.i.d. uniform variates taking values in {1, 2, ···, n}. The step-size η is the lea...