Research Laboratories

Information Processing and Discovery (iPAD) Laboratory

Research Faculty

Meng-Chang Chen
Research Fellow
Computer Science, University of California, Los Angeles

Ming-Syan Chen
Distinguished Research Fellow
Computer, Information and Control Engineering, The University of Michigan at Ann Arbor

Tsan-sheng Hsu
Research Fellow
Computer Sciences, University of Texas at Austin

Hong-Yuan Mark Liao
Research Fellow
Electrical Engineering, Northwestern University

Churn-Jung Liau
Research Fellow
CSIE, National Taiwan University

Da-Wei Wang
Research Fellow
Computer Science, Yale University

Mi-Yen Yeh
Assistant Research Fellow
Electrical Engineering, National Taiwan University

Group Profile

The Information Processing and Discovery (iPAD) group of IIS focuses its research on (1) massive data computation and (2) data mining technologies and applications. For massive data computation, there are several ongoing research projects, including knowledge representation and inference, privacy risk, and large graph access. For data mining, current interests include uncertain data processing, social network mining, data mining in the cloud, data stream mining, and multi-mode mining.

1. Massive Data Computation

● Agent-Based Knowledge Representation and Inference

A great deal of information and knowledge is implicit within massive amounts of data. An important issue is to study the knowledge representation and inference problems relevant to intelligent agent systems based on formal logics. We study methods to inductively produce useful rules and knowledge, with special attention to the representation problems of such extracted knowledge. With a proper knowledge representation framework, derived knowledge can serve as the basis for further reasoning and decision making within intelligent agent systems. Different agent systems can exchange knowledge based on a common representation, and agent systems can produce even more complex knowledge by invoking proper fusion mechanisms to incorporate knowledge from various sources. Fully distributed yet coordinated knowledge extraction and processing mechanisms can derive useful knowledge for decision making from massive data sets. This can effectively mitigate the information explosion problem for decision makers.

● Privacy Risk and Threat

Many institutions have collected massive and comprehensive individual data for various purposes, and sharing this data poses a great threat to individual privacy. In the past, we proposed a logic framework to study the risks to privacy when publishing micro-data, and a quantitative measurement of this privacy threat. Based on the proposed measurement, we designed and developed a gatekeeper system, CellSecu. In the future, we plan to study a more challenging issue: the database linkage problem, that is, how to get the intended final answers without really linking the databases. In doing so, we can apply multiparty privacy computation techniques, and we plan to use secure scalar products as a building block to construct basic functions, from which various application systems can be built. The ultimate goal is to develop a complete system in which users can write their programs in certain high-level languages and then have them translated into secure multiparty code automatically.

● Large Graph Access

Graphs serve as the basic model for many applications, such as social networks, games, or disease spreading. For some real applications, a basic graph model is too large to be stored entirely within the main memory with current technology. Good examples include graphs for the Chinese version of chess and individual-based disease spreading graphs. In order to handle such massive graphs, we must modify original algorithms or develop new algorithms specifically tailored for this challenge. Currently, we have some interesting results for the Chinese chess end-game graphs and the models of disease, and we plan to further our understanding of this problem’s underlying algorithmic issues while also seeking new applications for our methods.

2. Data Mining Technologies and Applications

● Uncertain Data Mining and Query Processing

In order to protect privacy, people deliberately introduce disturbances to their confidential data before further processing. As a result, this data is no longer deterministic, and instead is better described as random variables with unknown probability distributions, called uncertain data. In contrast to deterministic data, new probabilistic data models and distance metrics for uncertain data should be designed to deal with the errors caused by this uncertainty. Moreover, we want to design probabilistic query processing methods for uncertain data. Our research results will be further extended to query processing in mobile applications, such as location-based services where the positions of objects are usually estimated with uncertainty.

● Social Network Mining

As growth in social networking applications explodes, large amounts of rich types of data have rapidly emerged. A great deal of interesting information is hidden within these new types of data. We intend to discover useful knowledge from analyzing social network data, and then use this information to further develop innovative applications and services. Our research topics include establishing a systematic data collection module, investigating new algorithms for significant node identification and community detection in social networks, and designing progressive and incremental algorithms to adapt to the dynamic properties of social networks.

● Data Management and Mining in the Cloud Environment

Cloud computing is now listed as one of the major emerging industries in the country. In this new computing environment, where the concepts of Software as a Service (SaaS) and Platform as a Service (PaaS) are realized, we have the following possible research topics: databases delivered as a service and data mining as a service. We want to design online databases for multiple groups of multiple users, where the size and power of each database are allocated on demand. With a new way of organizing data (e.g., relational vs. map/reduce), corresponding indexing methods, concurrency control, privacy preservation, and OLTP/OLAP services should all be developed. We would also like to design a framework that provides data mining applications on demand. By extending traditional parallel and distributed mining techniques, we will develop methods that deal with input data from multiple sources, utilize cloud resources efficiently, and report the final mining results to multiple subscribers.

● Data Stream Management and Mining

More and more applications are now dealing with data in the form of quickly growing streams. Examples include stock market trading, sensor network data analysis, weather forecast applications, and video surveillance. The data streams in these applications usually involve huge volumes of data that are constantly arriving at high speed. With limited computing resources and storage, we need to design real-time and approximate algorithms that can accommodate the speed and massive size of these data streams. In the data stream environment, our research can be divided into two aspects: 1) designing data summarization techniques and synopsis structures; 2) designing mining algorithms under different data stream models. In addition, we will also ensure that the approximated results still meet the quality requirements for applications that need real-time decisions.

● Multi-Mode Mining

Multi-mode mining is emerging for novel applications that require distinct knowledge discovered from multiple sources and an understanding of the association and influence among those sources. A typical example is a stock market prediction application, which may need to mine trading behavior from the data stream of a trading system as well as news events from news articles. The fusion of discovered knowledge from multiple sources is difficult, as the sources are in different forms and have incompatible semantics and timing. In this study, we will use the prediction market as our research context to investigate multi-mode mining issues.
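
As a small illustration of the data summarization idea mentioned under Data Stream Management and Mining, the sketch below keeps a fixed-size uniform random sample of a stream whose length is not known in advance (reservoir sampling, a standard textbook technique). It is an assumption chosen for illustration, not a description of the group's own synopsis structures; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Illustrative only: uses O(k) memory and one pass, so the stream
    never has to be stored in full.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Summarize one million simulated readings with only 5 stored values.
    print(reservoir_sample(range(1_000_000), 5, random.Random(0)))
```

The sample can then stand in for the full stream when computing approximate statistics; the trade-off between sample size `k` and result quality is the kind of quality requirement the text refers to.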