I am a research scientist at ByteDance AI Lab, Speech & Audio Team in Singapore, where I lead a fundamental research group on audio and talking-face generation.

I am currently working on TTS, music generation, speech translation, and audio-driven talking face generation. If you are seeking any form of academic cooperation, please feel free to email me at ren.yi@bytedance.com.

I received my bachelor's degree from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院) and my master's degree from the College of Computer Science and Technology, Zhejiang University (浙江大学计算机科学与技术学院), advised by Zhou Zhao (赵洲). I also collaborate closely with Xu Tan (谭旭), Tao Qin (秦涛), and Tie-Yan Liu (刘铁岩) from Microsoft Research Asia.

In 2020, I won the Baidu Scholarship (10 recipients worldwide each year) and the ByteDance Scholars Program (10 recipients worldwide each year), and I was selected as one of the Top 100 Chinese New Stars in AI and as an AI Chinese New Star Outstanding Scholar (10 recipients worldwide each year).

My research interests include speech synthesis, neural machine translation, and automatic music generation. I have published 50+ papers at top international AI conferences such as NeurIPS, ICML, ICLR, and KDD.

To promote communication within the Chinese ML & NLP community, we (along with 11 other young scholars worldwide) founded the MLNLP community in 2021. I am honored to serve as one of the chairs of the MLNLP committee.

If you like the template of this homepage, welcome to star and fork my open-sourced template, AcadHomepage.

🔥 News

  • 2023.05: 🎉 Five papers are accepted by ACL 2023!
  • 2023.04: 🔥 We release AudioGPT (⭐️6k+)
  • 2023.04: 🎉 One paper (Make-an-Audio) is accepted by ICML 2023
  • 2023.01: DiffSinger was introduced in a very popular video (2M+ views) on Bilibili!
  • 2023.01: Three papers are accepted by ICLR 2023!
  • 2023.01: I joined ByteDance AI Lab, Speech & Audio Team as a research scientist in Singapore!
  • 2022.12: 🎉 My Google Scholar citations have exceeded 2,000!
  • 2022.02: I released a modern and responsive academic personal homepage template. Welcome to STAR and FORK it!

📝 Publications

🎙 Speech Synthesis

NeurIPS 2019

FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu


  • FastSpeech is the first fully parallel end-to-end speech synthesis model.
  • Academic Impact: This work is included in many well-known open-source speech synthesis projects, such as ESPnet. Our work has been covered by more than 20 media outlets and forums, such as 机器之心 and InfoQ.
  • Industry Impact: FastSpeech has been deployed in the Microsoft Azure TTS service, adding support for 49 more languages with state-of-the-art AI quality. It was also featured as a text-to-speech acceleration example at NVIDIA GTC 2020.
ICLR 2021

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu


AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao

NeurIPS 2021

👄 Talkingface Generation

📚 Machine Translation

🎼 Music Generation

🧑‍🎨 Generative Model


🎖 Honors and Awards

📖 Educations

  • 2019.06 - 2022.04, Master, Zhejiang University, Hangzhou.
  • 2015.09 - 2019.06, Undergraduate, Chu Kochen Honors College, Zhejiang University, Hangzhou.
  • 2012.09 - 2015.06, Luqiao Middle School, Taizhou.

💬 Invited Talks

  • 2022.02, Hosted MLNLP seminar | [Video]
  • 2021.06, Audio & Speech Synthesis, Huawei internal talk
  • 2021.03, Non-autoregressive Speech Synthesis, PaperWeekly & biendata | [video]
  • 2020.12, Non-autoregressive Speech Synthesis, Huawei Noah’s Ark Lab internal talk

💻 Internships