
Published: 2022.06.15

SIGS Graduate Stories | Song Xingchen's journey to making speech visible

Editor’s Note:

Another graduation season is upon us! As SIGS graduates prepare to start the next chapter of their lives, they have reflected on their time here and shared their stories and experiences with us.

Name: Song Xingchen

Major: Computer Science & Technology

 ◆

One morning in his first year of high school, Xingchen woke up and realized that he had lost hearing in both ears. He was later diagnosed with sensorineural deafness. Because no medication was available to treat the condition at his age, he underwent cochlear implant surgery.

With the help and support of his friends and family, Xingchen was able to overcome the mental health and academic challenges that resulted from his condition. At school, he refused to fall behind, studying on his own and eventually catching up with the second-year coursework.

Although the cochlear implants allowed him to hear sound, Xingchen still had trouble understanding speech. He struggled to follow lectures in class and conversations with classmates. These challenges gave rise to a desire to find a way to “see” sound.

An unconventional research path

In 2015, Xingchen entered Dalian University of Technology to study computer science. Through his studies, he realized that he could make it easier for himself and other hearing-impaired people to navigate the hearing world.

Xingchen playing basketball, a hobby of his

“In my undergraduate years, I realized that speech recognition technology was the solution, so I chose it as my research direction. My goal is to make communication easier for people like myself who are also hearing impaired,” said Xingchen. With that goal set, he began research in speech recognition technology while still an undergraduate.

After graduation, he was ready to take his dream to graduate school at Tsinghua SIGS. Xingchen’s advisor and mentor, Wu Zhiyong of the SIGS Division of Information Science and Technology, offered him a great deal of support and understanding.

“Prof. Wu’s research direction is actually speech synthesis, but I wanted to do work on speech recognition. The research methodology was completely opposite. I chose the unconventional path and had to start with little foundation, but Prof. Wu was extremely supportive.” Xingchen is full of gratitude for Prof. Wu.

At international conferences held online, Xingchen presented his research, but fielding questions from foreign attendees was a challenge with his implants. Prof. Wu helped transcribe the questions so he could fully participate.

According to Prof. Wu, Xingchen’s unconventional choice was the mark of a trailblazer and leader. He opened a new direction for the team’s research and drew his classmates’ attention to the field of speech recognition.

Realizing his dream

In his first year at SIGS, Xingchen built a solid theoretical research foundation and published three papers at high-level speech processing conferences: one at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) and two at INTERSPEECH. These papers have been cited over 60 times.

In his second year, he began to bridge the gap between his theoretical understanding and practical application.

“The great part about the Master of Engineering at Tsinghua SIGS is that internships are a required part of the program. Over the course of a year and a half at my internship, I was able to take a big step closer to my dream,” Xingchen said.

From 2021 to 2022, Xingchen worked on a project at Horizon Robotics with the WeNet open-source speech community. He was the main developer of WeTextProcessing, an open-source library for Chinese inverse text normalization. This achievement was an important contribution to the speech recognition open-source community and has already gone live at Horizon Robotics.

Xingchen was also involved in the creation of WeNet’s open-source speech recognition toolkit, contributing over 10,000 lines of code (fourth among all the project’s developers). The toolkit is a full-stack open-source speech recognition system, covering everything from data to framework, and has surpassed commercial systems in performance. Companies in various industries use it to build their own speech recognition services, and it has minimized the cost of developing these services for small and medium-sized businesses. Over 100 companies are using the toolkit.

“I’ve succeeded at making speech visible!” Xingchen said proudly. In his thesis acknowledgments, Xingchen also wrote, “As someone who is hearing impaired, speech recognition has not only become an indispensable ‘body part’ in my life like the cochlear implant but has also become my lifelong pursuit.” After graduation, Xingchen will bring his dream to Horizon Robotics full-time.

Edited by Alena Shish & Yuan Yang

Photos provided by interviewee

Cover photo by Wu Chen