Natural Language and Knowledge Processing Lab

and retrieving information from audio data, with special emphasis on speech and music. For speech, our research focuses on speaker recognition, spoken language recognition, voice conversion, and spoken document retrieval/summarization. As regards music, our ongoing research topics include vocal melody extraction, automatic music tagging, music emotion recognition, and music search. Our audio-tagging system was ranked 1st in the 2009 Music Information Retrieval Evaluation eXchange (MIREX2009). Our work on acoustic visual emotion Gaussians modeling for automatic music video generation won the ACM Multimedia 2012 Grand Challenge First Prize.

c) Chinese question answering system
We integrated several Chinese NLP techniques to construct a Chinese factoid QA system, which won first place in NTCIR-5 and NTCIR-6. In the future, we will extend the system to answer “how” and “what” types of questions.

d) Named entity recognition (NER)
Identifying person, location, and organization names in documents is very important for natural language understanding. In the past, we developed a machine-learning-based NER system, which won second place in the 2006 SIGHAN competition, and first place in the 2009 BioCreative II.5 gene name normalization shared task. In recent years, we have focused on using semantic rules and language patterns for NER, adopting a Markov Logic Network, which provides more flexibility.

e) Chinese Textual Entailment (TE)
TE is the process of identifying inferences between sentences. We have integrated several NLP tools and resources, focusing on deeper semantic and syntactic analysis to construct a Chinese TE recognition system, which performed well in the 2011 NTCIR-9 TE shared task.

f) Sentiment Analysis and Opinion Mining
Processing subjective information requires a deep understanding of the subject matter. We have studied opinions, sentiments, subjectivities, effects, emotions, and views in texts such as news articles, blogs, forums, reviews, comments, and dialogs, and developed related analysis techniques for Chinese and English. With the sentiment analysis techniques we developed, we built Feelit, a web-post emotion visualization system, and RESOLVE, a writing system for ESL learners, to help users understand and learn to express their emotions. Based on their promising results, we will continue to improve performance utilizing the Sinica parser, semantic role labels, and e-Hownet.

g) Semantic-Oriented Machine Translation:
We adopt the deep syntactic structure with a lexicon sense for each token and a case-label at each node. An integrated statistical model is used to search for the most likely combination of parse-tree, lexicon senses and node-case-labels (i.e., the best path). After the desired source semantic normal form is obtained, the corresponding target semantic normal form and the target string are then generated according to the patterns and parameters automatically learnt from those selected paths. For each unreachable sentence, a surrogate path will be created by searching for the path (within the search beam) that possesses the maximum value of the specified function (of the associated sentence-level BLEU score and likelihood value).

h) Chinese Natural Language Understanding:
We will build a Chinese natural language understanding system based on various analysis modules (e.g., word segmenter, parser, semantic role labeler, logic form transformer) that we have previously constructed. We plan to start this long-term research project with a Chinese machine reading program which can be evaluated with reading comprehension tests. This project is expected to start from elementary school texts, and gradually shift to high school and then real domain-oriented applications (e.g., Intelligent Q&A).

[Figures: Fast Input Software; Déjà vu on cell phone]
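
As a rough illustration of the surrogate-path selection described under g), the sketch below picks, from the candidate paths kept in a search beam, the one that maximizes a function of sentence-level BLEU and likelihood. The text does not state the actual function, so the linear combination used here, along with the names `CandidatePath`, `surrogate_score`, `select_surrogate`, and `weight`, are assumptions made for illustration only and not the lab's implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CandidatePath:
    """One path kept in the search beam (hypothetical structure)."""
    translation: str       # target string generated from this path
    bleu: float            # sentence-level BLEU against the reference, in [0, 1]
    log_likelihood: float  # model log-likelihood of the path (<= 0)


def surrogate_score(path: CandidatePath, weight: float) -> float:
    # ASSUMPTION: the source only says "a specified function" of BLEU and
    # likelihood; a weighted sum is used here purely as a stand-in.
    return path.bleu + weight * path.log_likelihood


def select_surrogate(beam: Sequence[CandidatePath], weight: float = 0.05) -> CandidatePath:
    """Return the path in the beam with the highest combined score."""
    if not beam:
        raise ValueError("the search beam is empty")
    return max(beam, key=lambda p: surrogate_score(p, weight))


# Toy beam with made-up scores, for demonstration only.
beam = [
    CandidatePath("translation A", bleu=0.40, log_likelihood=-2.0),
    CandidatePath("translation B", bleu=0.55, log_likelihood=-6.0),
    CandidatePath("translation C", bleu=0.50, log_likelihood=-3.0),
]
best = select_surrogate(beam, weight=0.05)
print(best.translation)
```

With `weight=0` the choice depends on BLEU alone, while larger weights favor paths the model itself considers likely; how the two terms are actually balanced would depend on the real function, which the text leaves unspecified.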
