
Research Laboratories  研究群



Natural Language and Knowledge Processing Laboratory (語言與知識處理實驗室)




Research Faculty
Wen-Lian Hsu, Distinguished Research Fellow
Fu Chang, Associate Research Fellow
Keh-Jiann Chen, Research Fellow
Hsin-Min Wang, Associate Research Fellow

Technical Faculty
Der-Ming Juang, Assistant Research Engineer





 Group Profile
We focus on problems in knowledge-based information processing, work strongly motivated by the flood of information on the Internet. Our research covers knowledge acquisition, utilization, and representation.

Basic natural language understanding means identifying the person, time, location, artifact, and event in a sentence. This is especially important in Chinese language processing, since Chinese has no explicit word boundaries.

1. Knowledge Acquisition

Our focus is on strategies and methodologies for automating knowledge acquisition processes.

a) Construction of linguistic knowledge bases
Over the past twenty-some years, we have developed an infrastructure for Chinese language processing that includes part-of-speech tagged corpora, treebanks, Chinese lexical databases, Chinese grammars, InfoMap, word identification systems, sentence parsers, etc. We have also developed basic techniques for knowledge extraction, such as named-entity recognition (NER), semantic role labeling, and relation extraction, in both Chinese and biological literature. We plan to extract linguistic and domain knowledge from the web through crowdsourcing.

b) Machine learning and pattern classification
We have proposed an extremely efficient tree decomposition approach to training non-linear support vector machines, with a speedup factor of hundreds, or sometimes even thousands, while still achieving comparable test accuracy. We are also pioneering a new method for ranking and selecting features using multiple feature subsets, and have gained advantages in computing speed, test accuracy, the number of essential features that are ranked above all irrelevant features, and the number of essential features among the selected features.

c) Pattern-based information extraction (IE)
Most pattern-based IE approaches start from manually provided seed instances. We have proposed two mechanisms to remove this human effort at the initial stage. First, we applied a semi-supervised method that can take a large quantity of seed instances of varying quality. Second, we proposed a weakly-supervised approach for extracting instances of semantic classes, which uses a compression model to assess the contextual evidence of its extractions.

2. Knowledge Utilization

Our Chinese input system, GOING, is used by over one million people in Taiwan. Our knowledge representation kernel, InfoMap, has been applied to a wide variety of application systems. In the future, we will design event frames as a major building block of our learning system. We will also develop basic technologies for processing spoken language and music to support various applications.

a) Knowledge-based Chinese language processing
We will focus on the conceptual processing of Chinese documents. Our system will utilize the statistical, linguistic, and common sense knowledge derived from our evolving Knowledge Web and E-HowNet to parse the conceptual structures of sentences and interpret their meanings.

b) Audio (speech / music / song) processing & retrieval
We focus on speech recognition, speaker recognition/segmentation/clustering, and spoken document retrieval/summarization. Our speaker verification system was ranked 2nd at the 2006 International Symposium on Chinese Spoken Language Processing. We have developed a prototype TV news retrieval system. With regard to music, our research focuses on vocal melody extraction, query by singing/humming, music tag annotation, and tag-based music retrieval. Our audio-tagging system was ranked 1st in the 2009 Music Information Retrieval Evaluation eXchange (MIREX2009). We have developed a novel query-by-multi-tags music search system.

c) Chinese question answering system
We integrated several Chinese NLP techniques to construct a Chinese factoid QA system, which won first place in NTCIR-5 and NTCIR-6. In the future, we will extend the system to answer “how” and “what” types of questions.

d) Named entity recognition (NER)
Identifying person, location, and organization names in documents is very important for natural language understanding. In the past, we developed a machine-learning-based NER system, which won second place in the 2006 SIGHAN competition and first place in the 2009 BioCreative II.5 gene name normalization shared task. In recent years, we have focused on using semantic rules and language patterns for NER by adopting Markov Logic Networks, which provide more flexibility in NER.

e) Chinese Textual Entailment (TE)
TE is the task of identifying inferences between sentences. We have integrated several NLP tools and resources, focusing on deeper semantic and syntactic analysis, to construct a Chinese TE recognition system, which achieved good performance in the 2011 NTCIR-9 TE shared task.

3. Knowledge Representation

We will remodel the current ontology structures of WordNet, HowNet, and FrameNet to achieve a more unified representation. We designed a universal concept representational mechanism called E-HowNet, which is a frame-based entity-relation model. E-HowNet has semantic composition and decomposition capabilities, which are intended to derive near-canonical sense representations of sentences through semantic composition of lexical senses.


