Learning De-identified Representations of Prosody from Raw Audio. Jack Weston, Raphaël Lenain, Udeepa Meepegama, Emil Fristed. Abstract: The phonetic problem has automatic speech recognition (ASR) as its obvious use-case. In rec...
Global Rhythm Style Transfer Without Text Transcriptions. Kaizhi Qian, Yang Zhang, Shiyu Chang, Jinjun Xiong, Chuang Gan, David Cox, Mark Hasegawa-Johnson. Abstract: which the speaker expresses the content. There are two majo...
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network. Vincent Wan, Chun-an Chan, Tom Kenter, Jakub Vit, Rob Clark. Abstract: μ Predicted prosodic features T...
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous. Abstract: can also be spok...