Parameter-free Locally Accelerated Conditional Gradients. Alejandro Carderera¹, Jelena Diakonikolas², Cheuk Yin (Eric) Lin², Sebastian Pokutta³. Abstract tions (Jaggi, 2013; Garber, 2016; Hazan & Luo, 2016; Braun et al., 2017; 2019...
On Proximal Policy Optimization’s Heavy-tailed Gradients. Saurabh Garg¹, Joshua Zhanson², Emilio Parisotto¹, Adarsh Prasad¹, J. Zico Kolter², Zachary C. Lipton¹, Sivaraman Balakrishnan³, Ruslan Salakhutdinov¹, Pradeep Ravikumar¹. Ab...
Marginalized Stochastic Natural Gradients for Black-Box Variational Inference. Geng Ji¹,², Debora Sujono², Erik B. Sudderth². Abstract update single (or small blocks of) variational parameters, while holding all others fixed to thei...
Statistically Efficient Off-Policy Policy Gradients. Nathan Kallus¹, Masatoshi Uehara². Abstract Table 1. Comparison of off-policy policy gradient estimators. Here, f = Θ(g) means 0 < lim inf f/g ≤ lim sup f/g < ∞ (not to Policy gradi...
Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks. Sijia Liu¹, Songtao Lu², Xiangyi Chen³, Yao Feng⁴, Kaidi Xu⁵, Abdullah Al-Dujaili⁶, Mingyi Hong³, Una-May O’Reilly⁷,¹. Abs...
Estimating Q(s, s′) with Deep Deterministic Dynamics Gradients. Ashley D. Edwards¹, Himanshu Sahni², Rosanne Liu¹,³, Jane Hung¹, Ankit Jain¹, Rui Wang¹, Adrien Ecoffet¹, Thomas Miconi¹, Charles Isbell², Jason Yosinski¹,³. Abstract Action a_{t+1}...
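Boosting Frank-Wolfe by Chasing Gradients. Cyrille W. Combettes¹, Sebastian Pokutta²,³. Abstract if $\mathcal{C} = \{X \in \mathbb{R}^{m \times n} \mid \|X\|_{\mathrm{nuc}} \leq \tau\}$ is a nuclear norm-ball, a projection onto $\mathcal{C}$ requires computing an SVD, which has The Frank-Wolfe algorithm has be...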
Batch Reinforcement Learning with Hyperparameter Gradients. Byung-Jun Lee¹, Jongmin Lee¹, Peter Vrancx², Dongho Kim², Kee-Eung Kim²,³. Abstract real environment. However, this approach requires a lot of human effort including domain e...
Rao-Blackwellized Stochastic Gradients for Discrete Distributions. Runjing Liu¹, Jeffrey Regier², Nilesh Tripuraneni², Michael I. Jordan¹,², Jon McAuliffe¹,³. Abstract the focus of this paper, the exact gradient is computationally in...
Blended Conditional Gradients: The Unconditioning of Conditional Gradients. Gábor Braun¹, Sebastian Pokutta¹, Dan Tu¹, Stephen Wright². Abstract CG employs a linear programming (LP) oracle to minimize a linear function over the pol...
Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization. Jiong Zhang¹, Qi Lei¹, Inderjit S. Dhillon¹,². Abstract 1. Introduction Vanishing and exploding Gradients are two of the Deep neural networks have achi...
On Acceleration with Noise-Corrupted Gradients. Michael B. Cohen¹, Jelena Diakonikolas², Lorenzo Orecchia². Abstract Acceleration is interesting because it yields faster algorithms than classical steepest-descent algorithms, o...
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. Anish Athalye¹, Nicholas Carlini², David Wagner². Abstract apparent robustness against iterative optimization attacks: obfusca...
Fourier Policy Gradients. Matthew Fellows¹, Kamil Ciosek¹, Shimon Whiteson¹. Abstract Until recently, policy gradient methods were either restricted to deterministic policies (Silver et al., 2014) or suffered from We propose a new wa...
Escaping Saddles with Stochastic Gradients. Hadi Daneshmand¹, Jonas Kohler¹, Aurelien Lucchi¹, Thomas Hofmann¹. Abstract through a conservative step size (Nesterov, 2013) or via explicit variance-reduction techniques (Johnson & Zh...
DRACO: Byzantine-resilient Distributed Training via Redundant Gradients. Lingjiao Chen¹, Hongyi Wang¹, Zachary Charles¹, Dimitris Papailiopoulos¹. Abstract nately, even a single adversarial node in a distributed setup can introdu...
Understanding Synthetic Gradients and Decoupled Neural Interfaces. Wojciech Marian Czarnecki¹, Grzegorz Swirszcz¹, Max Jaderberg¹, Simon Osindero¹, Oriol Vinyals¹, Koray Kavukcuoglu¹. Abstract When training neural networks, the...
The Shattered Gradients Problem: If resnets are the answer, then what is the question? David Balduzzi¹, Marcus Frean¹, Lennox Leary¹, JP Lewis¹,², Kurt Wan-Duo Ma¹, Brian McWilliams³. Abstract He et al., 2015) with batch normalization (Iof...
Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution. Po-Wei Chou¹, Daniel Maturana¹, Sebastian Scherer¹. Abstract Figure 1. An example of continuous control with...
Decoupled Neural Interfaces using Synthetic Gradients. Max Jaderberg¹, Wojciech Marian Czarnecki¹, Simon Osindero¹, Oriol Vinyals¹, Alex Graves¹, David Silver¹, Koray Kavukcuog...