The Implicit Regularization of Stochastic Gradient Flow for Least Squares. Alnur Ali¹, Edgar Dobriban², Ryan J. Tibshirani³. Abstract: generated by sequential optimization algorithms may serve as cheap approximations to the more expe...
The Implicit and Explicit Regularization Effects of Dropout. Colin Wei¹, Sham Kakade², Tengyu Ma¹. Abstract: its regularization effects: Wager et al. (2013); Helmbold & Long (2015); Cavazza et al. (2017); Mianjy et al. (2018); Dropout is...
The Cost-free Nature of Optimally Tuning Tikhonov Regularizers and Other Ordered Smoothers. Pierre C. Bellec¹, Dana Yang². Abstract: is defined as the solution of the quadratic program We consider the problem of selecting the best es-w...
The Effect of Natural Distribution Shift on Question Answering Models. John Miller¹, Karl Krauth¹, Benjamin Recht¹, Ludwig Schmidt¹. Abstract: merely to obtain high scores on the SQuAD leaderboard, but rather to generalize to new example...
The Complexity of Finding Stationary Points with Stochastic Gradient Descent. Yoel Drori¹, Ohad Shamir¹ ². Abstract: is not to minimize f(x) over x, but rather ∇f(x). This question of finding stationary points has gained more atten- We s...
Temporal Phenotyping using Deep Predictive Clustering of Disease Progression. Changhee Lee¹, Mihaela van der Schaar² ³ ¹. Abstract: understanding such heterogeneous diseases. This allows clinicians to anticipate patients’ progno...
Tails of Lipschitz Triangular Flows. Priyank Jaini¹ ², Ivan Kobyzev³, Yaoliang Yu¹ ², Marcus A. Brubaker³ ⁴. Abstract: Vanden-Eijnden, 2010; Tabak & Turner, 2013; Rezende & Mohamed, 2015) and autoregressive models (Papamakar- We investi...
Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors. Yehuda Dar¹, Paul Mayer¹, Lorenzo Luzi¹, Richard G. Baraniuk¹. Abstract: Figure 1. The supervisi...
Super-efficiency of automatic differentiation for functions defined as a minimum. Pierre Ablin¹, Gabriel Peyré¹, Thomas Moreau². Abstract: sparse code, and L is the Lasso cost (Mairal et al., 2010). In this case, measures the ability of...
Structural Language Models of Code. Uri Alon¹, Roy Sadaka¹, Omer Levy² ³, Eran Yahav¹. Abstract: vlin et al., 2017; Ellis et al., 2019), while other recent approaches generate code in general languages like Java and We address the problem of...
StochasticRank: Global Optimization of Scale-Free Discrete Functions. Aleksei Ustimenko¹, Liudmila Prokhorenkova¹ ² ³. Abstract: To deal with the discrete structure of a ranking loss, one can use some smooth approximation, which is e...
SoftSort: A Continuous Relaxation for the argsort Operator. Sebastian Prillo¹, Julian Martin Eisenschlos². Abstract: tion. Because of this, operators such as the softmax are While sorting is an important procedure in com- ubiquitous i...
Self-Concordant Analysis of Frank-Wolfe Algorithms. Pavel Dvurechensky¹ ², Petr Ostroukhov³, Kamil Safin³, Shimrit Shtern⁴, Mathias Staudigl⁵. Abstract: condition (Bauschke et al., 2017; Lu et al., 2018), the cornerstone assumption...
Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM. Kunal Menda¹, Jean de Becdelièvre¹, Jayesh K. Gupta¹, Ilan Kroo¹, Mykel J. Kochenderfer¹, Zachary Manchester¹. Abstract: ··· x_t x_{t+1} ··· System...
Revisiting Fundamentals of Experience Replay. William Fedus¹ ², Prajit Ramachandran¹, Rishabh Agarwal¹, Yoshua Bengio² ³, Hugo Larochelle¹ ⁴, Mark Rowland⁵, Will Dabney⁵. Abstract: to understand the interplay of learning algorithms and d...
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks. Zitong Yang¹, Yaodong Yu¹, Chong You¹, Jacob Steinhardt¹ ², Yi Ma¹. Abstract: from a mismatch between the model class and the underlying data distribution, and is...
Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks. Francesco Croce¹, Matthias Hein¹. Abstract: variations are using other losses (Zhang et al., 2019b) and boost robustness via generati...
Recovery of Sparse Signals from a Mixture of Linear Samples. Arya Mazumdar¹, Soumyabrata Pal¹. Abstract: We will refer to the values returned by the oracle given these queries as samples. Mixture of linear regressions is a popular learnin...
Randomized Smoothing of All Shapes and Sizes. Greg Yang¹, Tony Duan¹ ², J. Edward Hu¹ ², Hadi Salman¹, Ilya Razenshteyn¹, Jerry Li¹. Abstract: heuristic defenses that are robust to specific classes of perturbations, but many would later be br...