Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks. Minmin Chen, Jeffrey Pennington, Samuel S. Schoenholz. Abstract tion (Graves et al., 2013) and recommendation syst...
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks. Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington. Abstract Figure...
Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs. Bin Hu, Stephen Wright, Laurent Lessard. Abstract Monro, 1951; Bottou & LeCun, 2003). Rece...
Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks. Daphna Weinshall, Gad Cohen, Dan Amir. Abstract forcement learning (e.g. Graves et al., 2017). Although it remained for the most part in the fringe...
A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates. Tianbao Yang, Qihang Lin, Lijun Zhang. Abstract 1. Introduction This paper focuses on convex constrained opti- In this paper, we aim at s...
Dissipativity Theory for Nesterov’s Accelerated Method. Bin Hu, Laurent Lessard. Abstract tories. Once a Lyapunov function is found, one can relate the rate of decrease of this internal energy to the rate of In this paper, we adapt the...