Online Limited Memory Neural-Linear Bandits with Likelihood Matching. Ofir Nabati 1, Tom Zahavy 1 2, Shie Mannor 1 3. Abstract: ...ploration during the representation learning phase is still an open problem. The -greedy policy (Langford & Zh...
Online Learning with Optimism and Delay. Genevieve Flaspohler 1 2, Francesco Orabona 3, Judah Cohen 4, Soukayna Mouatadid 5, Miruna Oprescu 6, Paulo Orenstein 7, Lester Mackey 6. Abstract: ...ial online learning algorithms provide robust perfor...
Recomposing the Reinforcement Learning Building Blocks with Hypernetworks. Elad Sarafian 1, Shai Keynan 1, Sarit Kraus 1. Abstract: The Reinforcement Learning (RL) building...
Recovering AES Keys with a Deep Cold Boot Attack. Itamar Zimerman 1, Eliya Nachmani 1 2, Lior Wolf 1. Abstract: ...key by exploiting the memory leakage and the redundancy of the key expansion function used by the encryption method. Cold boot atta...
Reasoning Over Virtual Knowledge Bases with Open Predicate Relations. Haitian Sun 1, Pat Verga 2, Bhuwan Dhingra 2, Ruslan Salakhutdinov 1, William W. Cohen 2. Abstract: An alternative to extracting information to augment existing KBs reli...
Randomized Exploration for Reinforcement Learning with General Value Function Approximation. Haque Ishfaq 1 2, Qiwen Cui 3, Viet Nguyen 1 2, Alex Ayoub 4, Zhuoran Yang 5, Zhaoran Wang 6, Doina Precup 1 2 7, Lin F. Yang 8. Abstract: ...when general func...
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game. Shuang Qiu 1, Jieping Ye 1, Zhaoran Wang 2, Zhuoran Yang 3. Abstract: ...is large and function approximators such as neural networks are employ...
On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP. Tianhao Wu 1 2, Yunchang Yang 3, Simon S. Du 4, Liwei Wang 3 5. Abstract: ...is vulnerable to corrupted data stemming from malicious entities (Huang et al., 2...
On Energy-Based Models with Overparametrized Shallow Neural Networks. Carles Domingo-Enrich 1, Alberto Bietti 2, Eric Vanden-Eijnden 1, Joan Bruna 1 2. Abstract: ...tional to exp{−f(x)}. Such energy-based models (EBMs) originate in stat...
Offline Meta-Reinforcement Learning with Advantage Weighting. Eric Mitchell 1, Rafael Rafailov 1, Xue Bin Peng 2, Sergey Levine 2, Chelsea Finn 1. Abstract: ...of reinforcement learning algorithms, when the goal is to ultimately learn many ta...
Offline Reinforcement Learning with Fisher Divergence Critic Regularization. Ilya Kostrikov 1 2, Jonathan Tompson 2, Rob Fergus 1 3, Ofir Nachum 2. Abstract: ...where deploying a new policy to interact with the live environment is expensive...
Offline Reinforcement Learning with Pseudometric Learning. Robert Dadashi 1, Shideh Rezaeifar 2, Nino Vieillard 1 3, Léonard Hussenot 1 4, Olivier Pietquin 1, Matthieu Geist 1. Abstract: ...that generated these experiences (Pomerleau, 199...
Offline Contextual Bandits with Overparameterized Models. David Brandfonbrener 1, William F. Whitney 1, Rajesh Ranganath 1, Joan Bruna 1. Abstract: In contrast, the best performance in modern supervised learning is often achieved by mas...
Object Segmentation without Labels with Large-Scale Generative Models. Andrey Voynov 1, Stanislav Morozov 1, Artem Babenko 1. Abstract: ...models to perform label-free object segmentation, where ground truth pixel-level labels are expe...
Multi-Task Reinforcement Learning with Context-based Representations. Shagun Sodhani 1, Amy Zhang 1 2 3, Joelle Pineau 1 2 3. Abstract: ...reinforcement learning (MTRL) is a promising approach to train effective real-world agents (Tanaka...
Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers. Luke Marris 1 2, Paul Muller 1 3, Marc Lanctot 1, Karl Tuyls 1, Thore Graepel 1 2. Abstract: ...Avis et al., 2010; Harsanyi & Selten, 1988). 2 Two-player, constant-s...
Monotonic Robust Policy Optimization with Model Discrepancy. Yuankun Jiang 1, Chenglin Li 2, Wenrui Dai 1, Junni Zou 1, Hongkai Xiong 2. Abstract: ...control tasks, e.g., playing computer games with human-level performance (Mnih et al., 2013;...
Model-Targeted Poisoning Attacks with Provable Convergence. Fnu Suya 1, Saeed Mahloujifar 2, Anshuman Suri 1, David Evans 1, Yuan Tian 1. Abstract: Most work on poisoning attacks has considered one of two extremal attacker objectives: indi...
Modeling Hierarchical Structures with Continuous Recursive Neural Networks. Jishnu Ray Chowdhury 1, Cornelia Caragea 1. Abstract: ...some of these structure-aware methods (Shen et al., 2019a; Qian et al., 2020) also exhibit better syste...
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling. Ying Fan 1, Yifei Ming 1. Abstract: ...in RL has been one of the main challenges: the agent is expected to balance between exploring unseen state-action Bal...