
Natural Language and Knowledge Processing Lab


Research Faculty
Wen-Lian Hsu, Distinguished Research Fellow
Keh-Jiann Chen, Research Fellow
Lun-Wei Ku, Assistant Research Fellow
Keh-Yih Su, Research Fellow
Hsin-Min Wang, Research Fellow

We focus on problems concerning knowledge-based information processing, a process which is strongly motivated by the deluge of information available on the Internet. We are currently engaged in research on knowledge acquisition, representation, and utilization, with special emphasis on Chinese processing.

1. Knowledge Acquisition
Our focus is on strategies and methodologies for automating knowledge acquisition processes.

a) Construction of linguistic knowledge bases
Over the past two decades, we have developed an infrastructure for Chinese language processing that includes a part-of-speech tagged corpus, treebanks, Chinese lexical databases, Chinese grammatical elements, InfoMap, word identification systems, and sentence parsers, among other components. We plan to use these tools, together with crowdsourcing, to extract linguistic and domain knowledge from the web.

b) Pattern-based information extraction (IE)
Most pattern-based IE approaches are initiated by manually providing seed instances. We have proposed a semi-supervised method that can handle a large number of seed instances of varying quality. Our strategy can provide flexible frame-based pattern matching and summarization.

2. Knowledge Representation
We will remodel the current ontology structures of WordNet, HowNet, and FrameNet to achieve a more unified representation. We have designed a universal concept representation mechanism called E-HowNet, a frame-based entity-relation model. E-HowNet has semantic composition and decomposition capabilities, which are intended to enable it to derive near-canonical sense representations of sentences through semantic composition of lexical senses.

3. Knowledge Utilization
Our Chinese input system, GOING, is used by over one million people in Taiwan. Our knowledge representation kernel, InfoMap, has been applied to a wide variety of application systems. In the future, we will design event frames as a major building block of our learning system. We will also develop basic technologies for processing spoken language and music to support various applications.

a) Knowledge-based Chinese language processing
We will focus on the conceptual processing of Chinese documents. Our system will utilize the statistical, linguistic, and commonsense knowledge derived by our evolving Knowledge Web and E-HowNet to parse the conceptual structures of sentences and interpret their meanings.

b) Audio (speech/music/song) processing & retrieval
Our goal is to develop methods for analyzing, extracting, recognizing, indexing,
