Seminar

Recent spoken dialogue system research by NYU Shanghai

Lecturer: Prof. Yik-Cheung (Wilson) Tam (NYU Shanghai)
Host: Lun-Wei Ku
Time: 2022-09-29 (Thu.) 13:30 ~ 15:30
Location: Auditorium 106 at IIS new Building

Abstract

In this talk, we will present our research efforts in the context of the tenth Dialogue System Technology Challenge (DSTC10).
First, we will present the problem of co-reference resolution in visual dialogue with rich multimodal inputs such as the visual scene, multi-turn conversational text, and background knowledge base descriptions of visual objects. We propose a UNITER-based model built on a visual-and-textual transformer architecture. Our approach achieved 2nd place in the Situated Interactive Multimodal Conversational AI (SIMMC2.0) track of the official DSTC10 evaluation organized by Meta AI.
In the second part of the talk, we will present the problem of automatic speech recognition (ASR) errors degrading spoken language understanding (SLU) performance in tasks that require access to external unstructured knowledge for question answering. We propose a simulation approach that improves SLU robustness by randomly corrupting clean training text with an ASR error simulator, then jointly learning to self-correct the errors and to minimize the target classification loss for knowledge cluster selection. In the recent DSTC10 evaluation, our approach demonstrated significant improvement in knowledge selection, boosting Recall@1 from 0.495 to 0.7144 compared to the official baseline provided by Alexa AI. Lastly, we will close the talk with ongoing research for the upcoming DSTC11 evaluation.

Institute of Information Science, Academia Sinica
