Speech Processing/Security

Replay Attack Diagram


Audio adversarial attack; Replay attack; Speech processing system anti-spoofing


With a fast-growing number of users and usage scenarios, the security of speech processing systems (e.g., speaker verification systems and intelligent personal assistants) has become a new concern. Recent work has found that speech processing systems are vulnerable to multiple types of attacks, such as replay attacks, audio adversarial attacks, and ultrasound attacks. For example, a small but carefully designed noise, when added to a normal speech signal, can make an automatic speech recognition system produce an arbitrary wrong transcription. These attacks have been shown to be highly threatening in both experimental and realistic settings. Even worse, an attacker can easily bypass conventional defense models through adversarial drift. We therefore focus on studying a robust and universal defense strategy that can protect speech processing systems from a range of attack techniques and their variants.
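
To make the idea of "a small but carefully designed noise" concrete, the following is a minimal sketch, not the attack used in our papers. It assumes a toy logistic classifier with random weights standing in for a speech model, a 220 Hz tone standing in for speech, and a single step in the style of the fast gradient sign method. All names (`predict`, `fgsm_perturb`, `w`, `b`, `epsilon`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def predict(x, w, b):
    """Return (probability of class 1, hard label) for a toy logistic model."""
    logit = float(np.dot(w, x) + b)
    p = 1.0 / (1.0 + np.exp(-logit))
    return p, int(p > 0.5)


def fgsm_perturb(x, w, b, label, epsilon):
    """Shift every sample by +/- epsilon in the direction that raises the loss."""
    p, _ = predict(x, w, b)
    grad = (p - label) * w  # gradient of the cross-entropy loss w.r.t. x
    return x + epsilon * np.sign(grad)


rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
x = 0.5 * np.sin(2 * np.pi * 220.0 * t)  # 1 s tone standing in for speech

w = rng.normal(size=x.size)  # hypothetical model weights (assumption)
w /= np.linalg.norm(w)
b = 0.0

_, clean_pred = predict(x, w, b)

# Smallest per-sample budget that reaches the decision boundary, plus a 50% margin.
logit = float(np.dot(w, x) + b)
epsilon = 1.5 * abs(logit) / np.abs(w).sum()

x_adv = fgsm_perturb(x, w, b, clean_pred, epsilon)
_, adv_pred = predict(x_adv, w, b)

print(f"epsilon = {epsilon:.5f} (signal peak amplitude 0.5)")
print(f"clean prediction: {clean_pred}, adversarial prediction: {adv_pred}")
```

In this toy setting the per-sample change is bounded by `epsilon`, which is small compared with the signal amplitude, yet the predicted class flips. Attacks on real speech recognition systems are considerably more involved, but they rely on the same principle of following the model's gradient.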

Source Code

Baseline model for speech processing system anti-spoofing


ReMASC Dataset

Publications


  • Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, and Christian Poellabauer, "ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems", Proceedings of the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH), Graz, Austria, September 2019 (ISCA Best Student Paper Award Nomination). 
  • Yuan Gong, Boyang Li, Christian Poellabauer, and Yiyu Shi, "Real-time Adversarial Attacks", Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China, August 2019. 
  • Yuan Gong and Christian Poellabauer, "Crafting Adversarial Examples for Speech Paralinguistics Applications", Proceedings of the DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security (DYNAMICS) Workshop, San Juan, Puerto Rico, December 2018.
  • Yuan Gong and Christian Poellabauer, "Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues", Proceedings of the 27th International Conference on Computer Communications and Networks (ICCCN), Hangzhou, China, July-August 2018.

Project Members

  • Dr. Christian Poellabauer
  • Yuan Gong
  • Jian Yang
  • Jacob Huber
  • Mitchell MacKnight