
Institute of Information Science, Academia Sinica






Syntax in Statistical Machine Translation

  • Speaker: Prof. Qun Liu (Dublin City University (DCU))
  • Time: 2015-07-14 (Tue.) 10:00 ~ 12:00
  • Venue: Auditorium 106, New IIS Building

It has been shown that syntax-based statistical machine translation can produce better translations than phrase-based translation, especially for language pairs with large structural differences. In this talk, I will give an overview of various syntax-based statistical translation models, including our recent progress on dependency-based translation models, and analyze their advantages and disadvantages. I will also explain several types of syntax-based language models used in statistical machine translation, and demonstrate our work on integrating a large-scale dependency language model into a syntax-based SMT system. Then I will present a new syntax-based machine translation evaluation metric and its results in the WMT Metrics Shared Tasks. Finally, I will outline some future work along this research direction.


Qun Liu is currently a Professor at Dublin City University (DCU) and the leader of the Natural Language Processing Group at the ADAPT Centre (formerly CNGL), Ireland. He received his Master's degree in Computer Science from the Institute of Computing Technology at the Chinese Academy of Sciences (ICT/CAS) in 1992, and his PhD degree in Computer Science from Peking University in 2004. Since 1992 he has been a Researcher and Professor at ICT/CAS, Beijing, where he was the Leader of the Natural Language Processing Research Group. He moved to Ireland and joined CNGL/DCU in July 2012, while still working with the NLP group at ICT/CAS as an Adjunct Professor and Group Leader. His research interests focus on machine translation and natural language processing, especially Chinese language processing and statistical machine translation models, approaches, and evaluation methods. He led the development of the widely used ICTCLAS open source Chinese word segmentation and POS tagging systems in 2002. In recent years he and his research group have mainly worked on syntax-based statistical machine translation models, context-aware translation methods, and adaptation technologies for NLP.