Two Heads are Better Than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination
Wenbo Zheng, Lan Yan, Chao Gou, Fei-Yue Wang
Abstract / 1. Introduction (excerpts): Even with a still image, humans can ratiocinate ... In a classro...
Train simultaneously, generalize Better: Stability of gradient-based minimax learners
Farzan Farnia, Asuman Ozdaglar
Abstract (excerpt): ... 2014) and adversarial training (Madry et al., 2017) have achieved great success over a wide array of...
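The excerpt points to gradient-based minimax learners of the kind used for GANs and adversarial training. As a minimal sketch of the "train simultaneously" pattern the title alludes to, the snippet below runs simultaneous gradient descent-ascent on a toy quadratic saddle-point objective; the objective, step size, and iteration count are illustrative assumptions, not the paper's setting.

def simultaneous_gda(x0=1.0, y0=-1.0, lr=0.1, steps=200):
    """Simultaneous gradient descent in x / ascent in y on the toy saddle
    f(x, y) = 0.5*x**2 + x*y - 0.5*y**2 (strongly convex in x, strongly concave in y)."""
    x, y = x0, y0
    for _ in range(steps):
        gx = x + y                             # df/dx
        gy = x - y                             # df/dy
        x, y = x - lr * gx, y + lr * gy        # both players update at the same time
    return x, y

print(simultaneous_gda())   # approaches the saddle point (0, 0)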
Towards Better Robust Generalization with Shift Consistency Regularization
Shufei Zhang, Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Xinping Yi
Abstract (excerpt): ... 2015; He et al., 2016; Miyato et al., 2017; Zagoruyko & Komodakis...
Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing
Kaixin Wang, Kuangqi Zhou, Qixin Zhang, Jie Shao, Bryan Hooi, Jiashi Feng
Abstract (excerpt): ... [Figure 1. Visualization of environment and Lap...]
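The title refers to Laplacian representations: low-order eigenvectors of the graph Laplacian of the environment's state-transition graph, used as state features. The sketch below computes such a representation exactly for a tiny chain environment; the environment, the feature dimension d, and the function name are illustrative assumptions (in this line of work the representation is typically learned approximately from trajectories rather than obtained by an exact eigendecomposition).

import numpy as np

def laplacian_representation(adjacency, d=2):
    """Return the d smallest nontrivial eigenvectors of the graph Laplacian,
    used as d-dimensional state features."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A                 # combinatorial Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)           # ascending eigenvalues
    return eigvecs[:, 1:d + 1]                     # skip the constant eigenvector

# Example: a 4-state chain environment 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(laplacian_representation(A))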
Toward Better Generalization Bounds with Locally Elastic Stability
Zhun Deng, Hangfeng He, Weijie J. Su
Abstract (excerpt): ... variety of approaches from statistical learning theory (Vapnik, 1979; 2013; Bartlett & Mendelson, 2002; Bousquet...
Better Training using Weight-Constrained Stochastic Dynamics
Benedict Leimkuhler, Tiffany Vlaar, Timothée Pouchon, Amos Storkey
Abstract (excerpt): Current approaches to enhance the generalization performance of overparameteriz...
Improving Transformer Optimization Through Better Initialization
Xiao Shi Huang, Felipe Pérez, Jimmy Ba, Maksims Volkovs
Abstract (excerpt): ... et al., 2019; Sun et al., 2019). Despite the broad applications, optimization in the Transfo...
Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems
Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas
Abstract (excerpt): ... understand that the nature of computation done by deep and shallow ne...
Manifold Mixup: Better Representations by Interpolating Hidden States
Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, David Lopez-Paz, Yoshua Bengio
Abstract (excerpt): ... This is a worrying prospect, since d...
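The title names the core operation, interpolating hidden states. Below is a minimal sketch of that step for a single batch: intermediate-layer activations and one-hot labels are mixed with a Beta-distributed coefficient, in the spirit of mixup applied at a hidden layer. The alpha value, array shapes, and function name are my own illustrative choices, and the questions of which layer to mix at and how the mixed loss is backpropagated are left outside the sketch.

import numpy as np

def manifold_mixup_batch(hidden, labels_onehot, alpha=2.0, seed=0):
    """Mix a batch of hidden activations and their one-hot labels.

    hidden:        (batch, features) activations at some intermediate layer
    labels_onehot: (batch, classes)  one-hot targets
    """
    rng = np.random.default_rng(seed)
    lam = rng.beta(alpha, alpha)                     # mixing coefficient
    perm = rng.permutation(hidden.shape[0])          # random pairing within the batch
    mixed_hidden = lam * hidden + (1.0 - lam) * hidden[perm]
    mixed_labels = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_hidden, mixed_labels, lam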
Better generalization with less data using robust gradient descent
Matthew J. Holland, Kazushi Ikeda
Abstract (excerpt): ... the de facto standard learning strategy for tackling most machine learning problems (Kearns & Schapire, 1994; Bartle...
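The excerpt only sets up background, but the title points at the idea of replacing the empirical-mean gradient with a robust estimate before taking a descent step. The sketch below uses a generic coordinate-wise median-of-means over per-example gradients as a stand-in; the authors' estimator differs in its details, so treat this purely as an illustration of the "robust gradient descent" pattern.

import numpy as np

def median_of_means(per_example_grads, n_blocks=5):
    """Coordinate-wise median of block means over per-example gradients
    (a generic robust mean estimate, not the paper's specific estimator)."""
    blocks = np.array_split(per_example_grads, n_blocks)
    block_means = np.stack([b.mean(axis=0) for b in blocks])
    return np.median(block_means, axis=0)

def robust_gradient_step(w, per_example_grads, lr=0.1):
    # Descend along the robustified gradient estimate instead of the plain average.
    return w - lr * median_of_means(per_example_grads)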
A Better k-means++ Algorithm via Local Search
Silvio Lattanzi, Christian Sohler
Abstract (excerpt): The k-means++ seeding algorithm (Arthur & Vassilvitskii, 2007) is a simple way to improve Lloyd's algorithm. The ... In this paper, we develop a...
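For context on the baseline the paper improves, here is a sketch of the standard k-means++ (D^2) seeding of Arthur & Vassilvitskii (2007) that the excerpt mentions: each new center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The local-search refinement the paper develops is not shown, and the implementation details below are my own.

import numpy as np

def kmeans_plus_plus_seeding(X, k, seed=0):
    """D^2 seeding: pick the first center uniformly, then each subsequent center
    with probability proportional to its squared distance to the closest chosen center."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    for _ in range(k - 1):
        C = np.array(centers)
        d2 = np.min(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        probs = d2 / d2.sum()
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)

# These seeds would then be refined by Lloyd's algorithm (alternating assignment / mean updates).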
Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem
Alon Brutzkus, Amir Globerson
Abstract (excerpt): ... with N2 > N1 neurons achieve zero training error, but the larger network has better test error. This somewh...
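The excerpt compares two over-parameterized networks of widths N1 < N2 that both fit the training data, with the wider one generalizing better on an XOR-style problem. The toy experiment below mimics that comparison with a small two-layer ReLU network trained by full-batch gradient descent on a 2-D XOR-sign task; the data distribution, widths, loss, and training schedule are illustrative assumptions rather than the paper's exact setup.

import numpy as np

def train_two_layer_relu(X, y, width, lr=0.05, steps=10000, seed=0):
    """Two-layer ReLU network trained with full-batch gradient descent on squared loss."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(width, X.shape[1])) / np.sqrt(X.shape[1])
    v = rng.normal(size=width) / np.sqrt(width)
    n = len(y)
    for _ in range(steps):
        H = np.maximum(X @ W.T, 0.0)                       # hidden activations
        err = H @ v - y                                    # residuals
        v -= lr * (H.T @ err) / n
        W -= lr * (((err[:, None] * v) * (H > 0)).T @ X) / n
    return W, v

def accuracy(W, v, X, y):
    return np.mean(np.sign(np.maximum(X @ W.T, 0.0) @ v) == y)

# 2-D XOR-style task: the label is the sign of the product of the coordinates.
rng = np.random.default_rng(1)
X_tr, X_te = rng.uniform(-1, 1, (40, 2)), rng.uniform(-1, 1, (2000, 2))
y_tr, y_te = np.sign(X_tr[:, 0] * X_tr[:, 1]), np.sign(X_te[:, 0] * X_te[:, 1])
for width in (4, 64):    # the narrow vs. wide comparison from the excerpt
    W, v = train_two_layer_relu(X_tr, y_tr, width)
    print(width, accuracy(W, v, X_tr, y_tr), accuracy(W, v, X_te, y_te))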
Tighter Variational Bounds are Not Necessarily Better
Tom Rainforth, Adam R. Kosiorek, Tuan Anh Le, Chris J. Maddison, Maximilian Igl, Frank Wood, Yee Whye Teh
Abstract (excerpt): ... & Kamp, 1988; Hinton & Zemel, 1994; Gregor et al., 2016; Chen et...
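The "variational bounds" in the title are, in this line of work, the K-sample importance-weighted lower bounds on log p(x), which tighten as K grows (the title's point being that a tighter bound is not automatically better for learning). The toy numerical check below estimates such a bound for a one-dimensional Gaussian latent-variable model; the model, proposal, and values of K are my own illustrative choices.

import numpy as np

def iw_bound_estimate(x, K, n_mc=20000, seed=0):
    """Monte-Carlo estimate of the K-sample importance-weighted lower bound
    E[ log (1/K) sum_k p(x, z_k) / q(z_k | x) ] <= log p(x)
    for the toy model z ~ N(0,1), x|z ~ N(z,1), with proposal q(z|x) = N(x/2, 1)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(loc=x / 2.0, scale=1.0, size=(n_mc, K))
    log_joint = -0.5 * (z**2 + (x - z)**2) - np.log(2.0 * np.pi)   # log p(x, z)
    log_q = -0.5 * (z - x / 2.0)**2 - 0.5 * np.log(2.0 * np.pi)    # log q(z | x)
    log_w = log_joint - log_q
    m = log_w.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))))

x = 1.5
for K in (1, 5, 50):                      # K = 1 recovers the standard ELBO
    print(K, iw_bound_estimate(x, K))
print("log p(x) =", -0.25 * x**2 - 0.5 * np.log(4.0 * np.pi))   # exact, since p(x) = N(x; 0, 2)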
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband, Benjamin Van Roy
Abstract (excerpt): ... estimate of future value and selects the action with the greatest estimate. If a selected action is not near-optimal, the...
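The excerpt describes the optimistic pattern: act on the greatest value estimate. For contrast, the sketch below shows posterior sampling in its simplest setting, a Beta-Bernoulli bandit: sample one plausible model from the posterior and act greedily with respect to that sample. The paper's setting is episodic reinforcement learning rather than bandits, so this is only a simplified illustration with my own priors and horizon.

import numpy as np

def posterior_sampling_bandit(true_means, horizon=1000, seed=0):
    """Beta-Bernoulli posterior (Thompson) sampling."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    successes = np.ones(k)     # Beta(1, 1) priors for each arm
    failures = np.ones(k)
    total_reward = 0.0
    for _ in range(horizon):
        sampled_means = rng.beta(successes, failures)   # one posterior sample per arm
        a = int(np.argmax(sampled_means))               # greedy w.r.t. the sampled model
        r = float(rng.random() < true_means[a])         # Bernoulli reward
        successes[a] += r
        failures[a] += 1.0 - r
        total_reward += r
    return total_reward

print(posterior_sampling_bandit([0.2, 0.5, 0.55]))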
Sequence to Better Sequence: Continuous Revision of Combinatorial Structures
Jonas Mueller, David Gifford, Tommi Jaakkola
Abstract (excerpt): ... those which appear realistic). For example: a random sequence of words will almost never form a...
Learning Gradient Descent: Better Generalization and Longer Horizons
Kaifeng Lv, Shunhua Jiang, Jian Li
Abstract / 1.1. Existing Work (excerpts): Training deep neural networks is a highly non- ... To address the above issue, a promising approach is...
Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms
Arturs Backurs, Christos Tzamos
Abstract (excerpt): ... O(Tn²) time for any HMM with n states and an observation sequence of length T. This algorithm is known as the ... The cla...
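For reference, the classical decoder the excerpt describes is the standard O(Tn²) dynamic program written out below; the interface and variable names are my own, and log probabilities are used for numerical stability.

import numpy as np

def viterbi(log_init, log_trans, log_emit, obs):
    """Classical Viterbi decoding in O(T * n^2) time for an HMM with n states.

    log_init:  (n,)   log initial state probabilities
    log_trans: (n, n) log transition probabilities, [i, j] = log P(next state j | state i)
    log_emit:  (n, m) log emission probabilities,   [i, o] = log P(observation o | state i)
    obs:       length-T sequence of observation indices
    """
    n, T = len(log_init), len(obs)
    score = log_init + log_emit[:, obs[0]]
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans              # (prev state, next state) scores
        back[t] = np.argmax(cand, axis=0)              # best predecessor for each state
        score = cand[back[t], np.arange(n)] + log_emit[:, obs[t]]
    # Backtrack the most likely state sequence.
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]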