[Paper Review] VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers October 02 2024 VALL-E 2
[Paper Review] VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment September 30 2024 VALL-E R
[Paper Review] Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling September 25 2024 VALL-E X
[Paper Review] Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers September 23 2024 VALL-E
[Paper Review] Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech September 23 2024 Exploring the situations in which a probabilistic duration model is effective in TTS