IEEE Distinguished Industry Speaker Dr. Tomohiro Nakatani to Give Guest Lecture at Paderborn University

Nachrichtentechnik (NT) / Heinz Nixdorf Institut

On February 27, the IEEE Signal Processing Society is pleased to welcome IEEE Distinguished Industry Speaker Dr. Tomohiro Nakatani for a guest lecture at Paderborn University. He will give a talk titled “Enhancing Distant Automatic Speech Recognition via Model-Based Multi-Microphone Front-Ends”. The talk takes place at 11:00 a.m. in room L3.204 at Paderborn University. Tomohiro Nakatani is a Senior Distinguished Researcher at the Communication Science Laboratories, NTT, Inc., Japan. He was a member of the IEEE Signal Processing Society Audio and Acoustic Signal Processing Technical Committee (2009–2014) and of the Speech and Language Processing Technical Committee (2016–2021). He was named an IEEE Fellow in 2021.

Abstract: 

Distant Automatic Speech Recognition (DASR) refers to the task of recognizing speech captured by far-field microphones. It supports a wide range of applications, including the recognition of natural human conversations in everyday environments. A major challenge in DASR is maintaining high recognition accuracy in the presence of interfering signals such as background noise, reverberation, and overlapping speech.

This talk will provide an overview of model-based multi-microphone front-end techniques developed to suppress interference in DASR. A key strength of this approach is its ability to decompose signals into individual components using physical and probabilistic signal models, without necessarily requiring prior training. This property enables strong adaptability to unknown and complex environments. Moreover, when combined with neural network approaches, this framework enables highly accurate front-end processing under adverse conditions.

Through challenging DASR scenarios, the talk will demonstrate how dereverberation, denoising, and source separation front-ends can substantially enhance recognition performance.