On the Global Convergence Rates of Softmax Policy Gradient Methods. Jincheng Mei, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans. Abstract: they guarantee monotonic improvement of the value. A secondary appeal is that polic...
Revisiting the Softmax Bellman Operator: New Benefits and New Perspective. Zhao Song, Ronald E. Parr, Lawrence Carin. Abstract: tivates the use of exploratory and potentially sub-optimal actions during learning, and one commonly-u...
Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities. Octavian-Eugen Ganea, Sylvain Gelly, Gary Bécigneul, Aliaksei Severyn. Abstract: by a Softmax function to output a probability distribution ov...
Adaptive Sampled Softmax with Kernel Based Sampling. Guy Blanc, Steffen Rendle. Abstract: mization algorithm, e.g., stochastic gradient descent, needs to compute the gradients with respect to the loss. When the Softmax is the most co...
Efficient Softmax approximation for GPUs. Édouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou. Abstract: by objective criteria such as perplexity (ppl), which directly measures the ability of the sy...
An Alternative Softmax Operator for Reinforcement Learning. Kavosh Asadi, Michael L. Littman. Abstract: An ideal Softmax operator is a parameterized set of operators that: A Softmax operator applied to a set of values acts somewhat li...
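
All of the entries above concern the softmax function, which one of the fragments describes as producing a probability distribution from a set of scores. As a point of reference, that mapping can be sketched as follows. This is a generic, numerically stable version in plain Python, not code taken from any of the listed papers; the function name `softmax` and the `temperature` parameter are illustrative assumptions.

```python
import math


def softmax(scores, temperature=1.0):
    """Map a list of real-valued scores to a probability distribution.

    `temperature` is an illustrative parameter (assumed, not from the
    listed papers): smaller values concentrate mass on the largest score.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    scaled = [s / temperature for s in scores]
    # Subtract the maximum before exponentiating so that large scores
    # do not overflow; this leaves the result mathematically unchanged.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
sharp = softmax([1.0, 2.0, 3.0], temperature=0.1)
```

The outputs are non-negative and sum to one, and a higher score always receives a higher probability. Lowering `temperature` moves the result closer to a hard maximum, which is the sense in which a softmax over values behaves like a smoothed max.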