Trainsimultaneously,Generalizebetter:Stabilityofgradient-basedminimaxlearnersFarzanFarnia1AsumanOzdaglar1Abstract2014)andadversarialtraining(Madryetal.,2017)haveachievedgreatsuccessoverawidearrayof...
LearningtoGeneralizefromSparseandUnderspecifiedRewardsRishabhAgarwal1ChenLiang1DaleSchuurmans12MohammadNorouzi1AbstractRankNationGoldSilverx=“Whichnationwonthemost1USA1012silvermedal?”Weconsidert...
DoImageNetClassifiersGeneralizetoImageNet?BenjaminRecht⇤1RebeccaRoelofs1LudwigSchmidt1VaishaalShankar1AbstractConventionalwisdomsuggeststhatsuchdropsarisebecausethemodelshavebeenadaptedtothespeci...
WhydoLargerModelsGeneralizeBetter?ATheoreticalPerspectiveviatheXORProblemAlonBrutzkus1AmirGloberson1AbstractwithN2>N1neuronsachievezerotrainingerror,butthelargernetworkhasbettertesterror.Thissomewh...
SharpMinimaCanGeneralizeForDeepNetsLaurentDinh1RazvanPascanu2SamyBengio3YoshuaBengio14Abstractapproximatecertainfunctions(e.g.Montufaretal.,2014;Raghuetal.,2016).Otherworks(e.gDauphinetal.,2014;Des...
LearnedOptimizersthatScaleandGeneralizeOlgaWichrowska1NiruMaheswaranathan23MatthewW.Hoffman4SergioGo´mezColmenarejo4MishaDenil4NandodeFreitas4JaschaSohl-Dickstein1Abstractlearningshowthat,givensuf...