I am currently a research scientist at TikTok in Singapore.

I currently work on audio-driven talking face generation, text-to-speech, and music generation. If you are interested in any form of academic collaboration, please feel free to email me at ren.yi@bytedance.com. We are hiring interns!

I graduated from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院) with a bachelor’s degree and from the Department of Computer Science and Technology, Zhejiang University (浙江大学计算机科学与技术学院) with a master’s degree, advised by Zhou Zhao (赵洲). I also collaborate closely with Xu Tan (谭旭), Tao Qin (秦涛) and Tie-Yan Liu (刘铁岩) from Microsoft Research Asia.

I won the Baidu Scholarship (10 candidates worldwide each year) and the ByteDance Scholars Program (10 candidates worldwide each year) in 2020, and was selected as one of the top 100 AI Chinese New Stars and as an AI Chinese New Star Outstanding Scholar (10 candidates worldwide each year).

My research interests include speech synthesis, neural machine translation and automatic music generation. I have published 50+ papers at top international AI conferences such as NeurIPS, ICML, ICLR, and KDD.

To promote communication within the Chinese ML & NLP community, we (together with 11 other young scholars worldwide) founded the MLNLP community in 2021. I am honored to be one of the chairs of the MLNLP committee.

If you like the template of this homepage, feel free to star and fork my open-source template, AcadHomepage.

🔥 News

  • 2024.03: 🎉 Two papers were accepted to ICLR 2024
  • 2023.05: 🎉 Five papers were accepted to ACL 2023
  • 2023.01: DiffSinger was introduced in a very popular video (2000k+ views) on Bilibili!
  • 2023.01: I joined TikTok as a speech research scientist in Singapore!
  • 2022.02: I released a modern and responsive academic personal homepage template. Feel free to STAR and FORK!

📝 Publications

🎙 Speech Synthesis

NeurIPS 2019

FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Project

  • FastSpeech is the first fully parallel end-to-end speech synthesis model.
  • Academic Impact: This work has been included in many well-known open-source speech synthesis projects, such as ESPNet. It has been covered by more than 20 media outlets and forums, such as 机器之心 and InfoQ.
  • Industry Impact: FastSpeech has been deployed in the Microsoft Azure TTS service and supports 49 more languages with state-of-the-art AI quality. It was also shown as a text-to-speech system acceleration example at NVIDIA GTC 2020.

ICLR 2021

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Project

ICLR 2024

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Ziyue Jiang, Jinglin Liu, Yi Ren, et al.

Project

  • This work has been deployed in many TikTok products.
  • An advanced zero-shot voice cloning model.

AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao

NeurIPS 2021

👄 TalkingFace & Avatar

ICLR 2024

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis (Spotlight)
Zhenhui Ye, Tianyun Zhong, Yi Ren, et al.

Project | Code

📚 Machine Translation

🎼 Music & Dance Generation

🧑‍🎨 Generative Model

Others

🎖 Honors and Awards

📖 Education

  • 2019.06 - 2022.04, Master, Zhejiang University, Hangzhou.
  • 2015.09 - 2019.06, Undergraduate, Chu Kochen Honors College, Zhejiang University, Hangzhou.
  • 2012.09 - 2015.06, Luqiao Middle School, Taizhou.

💬 Invited Talks

  • 2022.02, Hosted MLNLP seminar | [Video]
  • 2021.06, Audio & Speech Synthesis, Huawei internal talk
  • 2021.03, Non-autoregressive Speech Synthesis, PaperWeekly & biendata | [Video]
  • 2020.12, Non-autoregressive Speech Synthesis, Huawei Noah’s Ark Lab internal talk

💻 Internships