You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling. Zhanpeng Zeng¹, Yunyang Xiong¹, Sathya N. Ravi², Shailesh Acharya³, Glenn Fung³, Vikas Singh¹. Abstract: language inference (Devlin et al., 2019) and parap...
Is Space-Time Attention All You Need for Video Understanding? Gedas Bertasius¹, Heng Wang¹, Lorenzo Torresani¹ ². Abstract: Video understanding shares several high-level similarities with NLP. First of all, videos and sentences are b...
Attention is not all you need: pure attention loses rank doubly exponentially with depth. Yihe Dong¹, Jean-Baptiste Cordonnier², Andreas Loukas³. Abstract: attention layers. Surprisingly, we find that pure self-attention networks (S...