Oblivious Sketching-based Central Path Method for Linear Programming. Zhao Song¹, Zheng Yu². Abstract: Consider solving a general linear program in standard form $\min_{Ax=b,\,x\ge 0} c^\top x$ of size $A \in \mathbb{R}^{d\times n}$ without redundant [...] In this work, we prop...
Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case. Liyu Chen¹, Haipeng Luo¹. Abstract: [...] end within a fixed number of steps is extensively studied in recent years (often known as episodi...
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning. Yaodong Yang¹, Jianye Hao¹,², Guangyong Chen³, Hongyao Tang¹, Yingfeng Chen⁴, Yujing Hu⁴, Changjie Fan⁴, Zhongyu Wei⁵. Abstract: [...] 1. Introduction: Recently, deep multia...
Near-optimal Regret Bounds for Stochastic Shortest Path. Alon Cohen¹, Haim Kaplan¹,², Yishay Mansour¹,², Aviv Rosenberg². Abstract: The focus of this work is on regret minimization in SSP. It builds on extensive literature on theoretical a...
Path Consistency Learning in Tsallis Entropy Regularized MDPs. Ofir Nachum¹, Yinlam Chow², Mohammad Ghavamzadeh². Abstract: [...] model is known, the optimal policy is the solution of the non-linear Bellman optimality equations (Bellman, 1...
Probabilistic Path Hamiltonian Monte Carlo. Vu Dinh¹, Arman Bilge¹,², Cheng Zhang¹, Frederick A. Matsen IV¹. Abstract: [...]date step, and thus has proved to be more effective than standard MCMC methods in a variety of applications. The Hamilton...