Run-Sort-ReRun: Escaping Batch Size Limitations in Sliced Wasserstein Generative Models. José Lezama¹, Wei Chen², Qiang Qiu². Abstract. 2017; Li et al., 2017; Mroueh et al., 2017; Heusel et al., 2017; Deshpande et al., 2018). However,...
Guarantees for Tuning the Step Size using a Learning-to-Learn Approach. Xiang Wang¹, Shuai Yuan¹, Chenwei Wu¹, Rong Ge¹. Abstract. rin et al. (2015) considered the idea of tuning these parameters by optimization; that is, consider a meta...
From Local Structures to Size Generalization in Graph Neural Networks. Gilad Yehudai¹, Ethan Fetaya², Eli Meirom¹, Gal Chechik¹,², Haggai Maron¹. Abstract. Figure 1. We study the ability of GNNs to generalize from small to large graphs, focu...
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers. Zhuohan Li¹, Eric Wallace¹, Sheng Shen¹, Kevin Lin¹, Kurt Keutzer¹, Dan Klein¹, Joseph E. Gonzalez¹. Abstract. Common Train Small Stop T...
Sample Amplification: Increasing Dataset Size even when Learning is Impossible. Brian Axelrod¹, Shivam Garg¹, Vatsal Sharan¹, Gregory Valiant¹. Abstract. or does it suffice to have access to a smaller dataset of size n < m drawn from D, and t...
One Size Fits All: Can We Train One Denoiser for All Noise Levels? Abhiram Gnansambandam¹, Stanley H. Chan¹,². Abstract. arguably universal for all learning-based estimators. When such a problem arises, the most straightforward solut...
History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms. Kaiyi Ji¹, Zhe Wang¹, Bowen Weng¹, Yi Zhou², Wei Zhang³, Yingbin Liang¹. Abstract. have been proposed to reduce the variance of SGD. Such variance reduction tech...