Extrapolation for Large-batch Training in Deep Learning

Tao Lin¹, Lingjing Kong¹, Sebastian U. Stich¹, Martin Jaggi¹

Abstract

2017) is used to reduce the gradient computation rounds so as to accelerate the training. However, in practi...