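Scaling Properties of Deep Residual Networks. Alain-Sam Cohen 1, Rama Cont 2, Alain Rossier 2 1, Renyuan Xu 2. Abstract where $h_k^{(L)}$ is the hidden state at layer $k = 0, \ldots, L$, $h_0^{(L)} = x \in \mathbb{R}^d$ the input, $h_L^{(L)} \in \mathbb{R}^d$ the output, $\sigma : \mathbb{R} \to \mathbb{R}$ is a non-Resid...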
Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information. Willie Neiswanger 1, Ke Alexander Wang 1, Stefano Ermon 1. Abstract tackled by Bayesian optimization methods (Shahria...
Unique Properties of Flat Minima in Deep Networks. Rotem Mulayoff 1, Tomer Michaeli 1. Abstract models (Gunasekar et al., 2018b) and deep nonlinear networks with homogeneous activation functions (Lyu & Li, It is well known that (stocha...
On the Theoretical Properties of the Network Jackknife. Qiaohui Lin 1, Robert Lunde 1, Purnamrita Sarkar 1. Abstract et al., 2017; Gonen et al., 2010; Kallaugher et al., 2019). However, comparatively little attention has been paid to West...
Local Convergence Properties of SAGA/Prox-SVRG and Acceleration. Clarice Poon 1, Jingwei Liang 1, Carola-Bibiane Schönlieb 1. Abstract ciently output solutions which take a certain structure; see for instance (Liang et al., 2017) f...
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank. Liang Zhao 1, Siyu Liao 1, Yanzhi Wang 2, Zhe Li 2, Jian Tang 2, Bo Yuan 1. Abstract Figure 1. Examples of commonly used LDR (structured) matrices, i.e.,...