Efficiently Solving MDPs with Stochastic Mirror Descent. Yujia Jin¹, Aaron Sidford¹. Abstract. an MDP given only restricted access to the model. In particular, we consider the problem of computing an ε-optimal In this paper we present a u...
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables. Qi Wang¹, Herke van Hoof¹. Abstract. decision-making (Gal & Ghahramani, 2016). Neural processes (NPs) constitute a family of vari- Fac...
Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization. Vien V. Mai¹, Mikael Johansson¹. Abstract. This function class is very rich and important in optimization (Rockafellar, 1982; Vial, 1...
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks. Zhishuai Guo¹, Mingrui Liu¹, Zhuoning Yuan¹, Li Shen², Wei Liu², Tianbao Yang¹. Abstract. 1. Introduction. In this paper, we study distributed al...
Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems? Hongchang Gao¹, Heng Huang¹,². Abstract. where Ω ⊂ R^d denotes a closed convex feasible set, each component function f_i is smooth and non-convex...
Accelerated Stochastic Gradient-free and Projection-free Methods. Feihu Huang¹,², Lue Tao¹,², Songcan Chen¹,². Abstract. 1. Introduction. In the paper, we propose a class of accelerated In the paper, we focus on solving the following const...
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study. Daniel S. Park¹,², Jascha Sohl-Dickstein¹, Quoc V. Le¹, Samuel L. Smith³. Abstract. Wilson et al., 2017; Sagun et al., 2017; Mandt et al., 201...
The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects. Zhanxing Zhu¹,²,³, Jingfeng Wu¹, Bing Yu¹, Lei Wu¹, Jinwen Ma¹. Abstract. Understanding the behavior of stochas...
SWALP: Stochastic Weight Averaging in Low-Precision Training. Guandao Yang¹, Tianyi Zhang¹, Polina Kirichenko¹, Junwen Bai¹, Andrew Gordon Wilson¹, Christopher De Sa¹. Abstract. and accumulate gradient information in higher precision...
Stochastic Gradient Push for Distributed Deep Learning. Mahmoud Assran¹,², Nicolas Loizou¹,³, Nicolas Ballas¹, Mike Rabbat¹. Abstract. tributed training of deep networks (Goyal et al., 2017; Li et al., 2014). Worker nodes compute local mi...
Stochastic Iterative Hard Thresholding for Graph-structured Sparsity Optimization. Baojian Zhou¹, Feng Chen¹, Yiming Ying². Abstract. 2005; Yuan & Lin, 2006) and more structured norms built on either disjoint or overlapping groups of...
Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence. Yi Xu¹, Qi Qi¹, Qihang Lin², Rong Jin³, Tianbao Yang¹. Abstract. where g(·) and h(·) are real-valued lower-semicontinu...
Stochastic Deep Networks. Gwendoline de Bie¹, Gabriel Peyré²,¹, Marco Cuturi³,⁴. Abstract. 2012) to deal now with shapes (Wu et al., 2015a), sounds (Lee et al., 2009), texts (LeCun et al., 1998) or Machine learning is increasingly targeti...
Stochastic Blockmodels meet Graph Neural Networks. Nikhil Mehta*¹, Lawrence Carin¹, Piyush Rai². Abstract. such as social and biological network analysis and recommender systems. These latent structures help discover the Stochast...
Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. Wouter Kool¹,², Herke van Hoof¹, Max Welling¹,³. Abstract. extension of the Gumbel-Max trick as the Gumbel-Top-k trick. The well-...
Simple Stochastic Gradient Methods for Non-Smooth Non-Convex Regularized Optimization. Michael R. Metel¹, Akiko Takeda¹,². Abstract. where f_j(w) = F(w, ξ_j) and has a Lipschitz continuous gradient. Our work focuses on Stochastic grad...
SEVER: A Robust Meta-Algorithm for Stochastic Optimization. Ilias Diakonikolas¹, Gautam Kamath², Daniel M. Kane³, Jerry Li⁴, Jacob Steinhardt⁵, Alistair Stewart⁶. Abstract. berg et al., 2002; Li et al., 2008) requiring painstaking man-...
Semi-Cyclic Stochastic Gradient Descent. Hubert Eichner¹, Tomer Koren¹, H. Brendan McMahan¹, Nathan Srebro², Kunal Talwar¹. Abstract. SGD, typically a few hundred devices are chosen randomly by the server to participate; critically, ho...
Riemannian adaptive Stochastic gradient algorithms on matrix manifolds. Hiroyuki Kasai¹, Pratik Jawanpuria², Bamdev Mishra². Abstract. ADAM (Kingma & Ba, 2015), arguably the most popular adaptive gradient method, additionally empl...
Rao-Blackwellized Stochastic Gradients for Discrete Distributions. Runjing Liu¹, Jeffrey Regier², Nilesh Tripuraneni², Michael I. Jordan¹,², Jon McAuliffe¹,³. Abstract. the focus of this paper, the exact gradient is computationally in...