Training data-efficient image transformers & Distillation through attention. Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou. Abstract. Recently, neural network...
Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model. Zi Wang. Abstract. resource-limited devices such as mobile phones and drones (Moskalenko et al., 2018). Recently, a large number of ap- Knowledge distillatio...
Model Distillation for Revenue Optimization: Interpretable Personalized Pricing. Max Biggs, Wei Sun, Markus Ettl. Abstract. main concerns is that the most accurate prediction models are often non-parametric functions which are co...
Data-Free Knowledge Distillation for Heterogeneous Federated Learning. Zhuangdi Zhu, Junyuan Hong, Jiayu Zhou. Abstract. privacy-preserving learning scheme, FL has shown its potential to facilitate real-world applications, i...
Cross-model Back-translated Distillation for Unsupervised Machine Translation. Xuan-Phi Nguyen, Shafiq Joty, Thanh-Tung Nguyen, Wu Kui, Ai Ti Aw. Abstract. monolingual data has been active. While Ravi & Knight (2011) and Kleme...
A Statistical Perspective on Distillation. Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar. Abstract. et al., 2019). One commonly accepted intuition from Hinton et al. (2015) is that the teacher...
Feature-map-level Online Adversarial Knowledge Distillation. Inseop Chung, SeongUk Park, Jangho Kim, Nojun Kwak. Abstract. such as mobile or embedded systems. To overcome this issue, many researches have been conducted to develop...
Dual-Path Distillation: A Unified Framework to Improve Black-Box Attacks. Yonggang Zhang, Ya Li, Tongliang Liu, Xinmei Tian. Abstract. (Athalye et al., 2018), evaluate model robustness (Carlini & Wagner, 2017; Moosavi-Dezfooli et...
Towards Understanding Knowledge Distillation. Mary Phuong, Christoph H. Lampert. Abstract. Distillation-based training has been confirmed several times: the optimization step is generally more well-behaved than Knowledge dist...
Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation. Ruohan Wang, Carlo Ciliberto, Pierluigi V. Amadori, Yiannis Demiris. Abstract. 2016). Despite its simplicity, BC typically requires a large amo...
Zero-Shot Knowledge Distillation in Deep Networks. Gaurav Kumar Nayak, Konda Reddy Mopuri, Vaisakh Shaj, R. Venkatesh Babu, Anirban Chakraborty. Abstract. lent performance, but they can be huge and computationally expensive. Henc...
Adversarial Distillation of Bayesian Neural Network Posteriors. Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel. Abstract. Uncertainty is important in many scenarios. For example, designers of a...