Institute of Information Science, Academia Sinica

Academic Seminar

Recent spoken dialogue system research by NYU Shanghai

  • Speaker: Prof. Yik-Cheung (Wilson) Tam (NYU Shanghai)
    Host: Lun-Wei Ku (古倫維)
  • Time: 2022-09-29 (Thu.) 13:30 ~ 15:30
  • Venue: Auditorium 106, IIS New Building
Abstract
In this talk, we will present our research efforts in the context of the tenth Dialogue System Technology Challenge (DSTC10).
First, we will address the problem of co-reference resolution in visual dialogue with rich multimodal inputs: the visual scene, multi-turn conversational text, and knowledge-base descriptions of visual objects. We propose a UNITER-based model built on a visual-and-textual transformer architecture. Our approach achieved 2nd place in the Situated Interactive Multimodal Conversational AI (SIMMC2.0) track of the official DSTC10 evaluation organized by Meta AI.
In the second part of the talk, we will turn to the problem of automatic speech recognition (ASR) errors, which degrade spoken language understanding (SLU) performance in tasks that require accessing external unstructured knowledge for question answering. We propose a simulation approach that improves SLU robustness by randomly corrupting clean training text with an ASR error simulator, then jointly training the model to correct the errors and to minimize the target classification loss for knowledge cluster selection. In the recent DSTC10 evaluation, our approach significantly improved knowledge selection, raising Recall@1 from 0.495 to 0.7144 over the official baseline provided by Alexa AI. Lastly, we will close the talk with our ongoing research for the upcoming DSTC11 evaluation.