Prof. Wei XUE shared cutting-edge singing voice synthesis findings at the 18th National Conference on Man-machine Speech Communication and at the 3rd SpeechHome Conference on Speech Technology 2023
19/12/2023
Prof. Wei XUE, Assistant Professor of the Division of Emerging Interdisciplinary Areas (EMIA), shared his insights on building and improving singing voice models at the 18th National Conference on Man-machine Speech Communication on Dec 8, 2023 and at the 3rd SpeechHome Conference on Speech Technology on Nov 18, 2023.

#researchwithoutboundaries

 

The National Conference on Man-Machine Speech Communication (NCMMSC) is an important forum where domestic experts, scholars and researchers in the field of speech exchange their latest research results and promote continued progress in research and development.

 

In a talk titled "Building the Singing Voice Foundation Model", Prof. XUE presented his team's work on constructing a large singing voice foundation model that enables cross-gender, cross-language, cross-range, zero-resource, and fast song synthesis. Unlike traditional AI singers, which require hours of training data and a fixed repertoire, this model supports modification of lyrics and tune and can sing any new song using only a few tens of seconds of data, achieving song synthesis rather than simple conversion.

 

The SpeechHome Conference on Speech Technology aims to foster exchanges on voice technology among industry, academia and research institutes, offer insight into future trends in technological innovation, and advance the development of intelligent voice technology in cutting-edge and open-source fields.

 

At this conference, Prof. XUE identified existing problems in vocal synthesis, including an extreme lack of labelled data, the high cost of fine labelling, and limited timbre. He then introduced a “high-speed, high-quality, zero-resource vocal synthesis” approach. With supporting technologies including CoMoSpeech and ZSinger, the diffusion model-based vocal synthesis method can be deployed in real time for industrial-grade applications and allows modeling and lyric/melody control of arbitrary human timbre without labelled data.

