Temporal Difference Learning as Gradient Splitting. Rui Liu 1, Alex Olshevsky 2. Abstract TD uses differences in predictions over successive time steps to drive the learning process, with the prediction at Temporal Difference learnin...
Preferential Temporal Difference Learning. Nishanth Anand 1 2, Doina Precup 1 2 3. Abstract TD-learning can be viewed as a way to approximate dynamic programming algorithms in Markovian environ- Temporal-Difference (TD) learning is...
Reducing Sampling Error in Batch Temporal Difference Learning. Brahma S. Pavse 1, Ishan Durugkar 1, Josiah P. Hanna 2 3, Peter Stone 1 4. Abstract policy (Puterman & Shin, 1978; Bertsekas, 1987; Konda & Tsitsiklis, 2000). These algorithms r...
Piecewise Linear Regression via a Difference of Convex Functions. Ali Siahkamari 1, Aditya Gangrade 2, Brian Kulis 1, Venkatesh Saligrama 1. Abstract problem is algorithmically challenging in high-dimensions, and many approaches to th...
Interference and Generalization in Temporal Difference Learning. Emmanuel Bengio 1 2, Joelle Pineau 1, Doina Precup 1 3. Abstract the interference between tasks in multi-task and continual learning (Lopez-Paz & Ranzato, 2017; Schaule...
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator. Stephen Tu 1, Benjamin Recht 1. Abstract theoretical understanding of the issue is still an open question. A more rigorous foundation could help to diffe...