“The search for truth takes you where the evidence leads you, even if, at first, you don’t want to go there.” Bart D. Ehrman,
Data Management and Information Discovery Lab
Research Faculty
Meng Chang Chen, Research Fellow
Yuan-Hao Chang, Assistant Research Fellow
Ming-Syan Chen, Distinguished Research Fellow
Hong-Yuan Mark Liao, Distinguished Research Fellow
De-Nian Yang, Associate Research Fellow
Mi-Yen Yeh, Associate Research Fellow

In the data explosion era, data of various types (e.g., sensor, trajectory, transaction, multimedia, social network, and Web browsing log data) are generated at an increasing rate. Because hardware and networks have become abundant and inexpensive, the timing has never been better to explore all possible means of utilizing such data to enhance existing applications or to investigate new technologies to solve difficult problems. The Data Management and Information Discovery Group was formed with the main objectives of initiating innovative research and strengthening scientific and technological excellence in: (1) effective collection, representation, storage, processing, and analysis of massive data; and (2) exploring data mining technologies to efficiently and effectively discover valuable knowledge within various types of data. Currently, the research of this group focuses on the following areas: (1) Time Series Data Analysis and Mining; (2) Social Network Analysis and Query Processing; (3) Designs for Embedded Databases on Mobile Devices and Sensors; and (4) Location-based Data Collection Platform and Applications. Within these research areas, we are conducting the following projects.
1. Time Series Data Analysis and Mining
A time series is a sequence of data points recorded at consecutive time instances, spaced at uniform or non-uniform intervals. Time series data are widely seen in daily life, including periodic readings of sensors in a machine-to-machine environment, stock trading records in financial markets, user activity monitoring in social networks, GPS traces of mobile objects, and so on. By analyzing and mining time series data, we can capture the characteristics of monitored objects over a period of time and obtain useful knowledge for developing further applications. For example, (i) the co-evolving trends mined from stock trading time series can be provided to stock program traders as decision support, and (ii) the moving behavior derived from huge collections of GPS trajectories of humans and vehicles is useful for developing location-based services or for urban planning.
Through this line of research, we aim to develop effective and efficient mining and analytical algorithms to discover interesting and useful patterns embedded in a single time series and between multiple ones, while considering various constraints. The designed algorithms should be able to handle time series data that are rapidly growing, high-dimensional, and huge in volume, while providing high-quality results. In relation to the aforementioned applications, we have designed offline/online clustering algorithms for multiple time series streams, similarity queries on distributed time series using both the Euclidean and dynamic time warping distances, random error reduction for similarity searches of time series, and GPS trajectory mining and search algorithms. In the future, we will seek more useful applications for time series analysis and mining, and design related algorithms.
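
The two distance measures mentioned above can be contrasted with a small example. The following is a minimal textbook sketch, not the group's distributed query algorithms: the function names `euclidean` and `dtw`, the absolute-difference point cost, and the sample series are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping distance with |x - y| as the point cost.

    Plain O(len(a) * len(b)) dynamic program without a warping window
    (an illustrative choice, not taken from the text).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Hypothetical series: y is x delayed by one time step.
x = [0, 1, 2, 1, 0, 0]
y = [0, 0, 1, 2, 1, 0]
print(euclidean(x, y))  # 2.0
print(dtw(x, y))        # 0.0
```

The Euclidean distance compares the two series point by point and therefore penalizes the one-step delay, whereas dynamic time warping may stretch the time axis and aligns the two shapes at zero cost. The trade-off is cost: the warping computation is quadratic in the series length, while the Euclidean distance is linear.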
2. Social Network Analysis and Query Processing
With the emergence of various social applications, social networks have become increasingly important in information sharing and analysis. An article published in Nature in 2013 demonstrated that social influence via online social networks (such as Facebook) impacts people’s decisions, and many new companies in Silicon Valley (such as Klout) now provide effective mechanisms to quantify social influence and diffusion for business marketing. Nevertheless, the literature mostly focuses on passive applications of social influence. By contrast, we propose using social influence for active friending, where each user actively specifies a friending target, and the search engine returns a group of intermediate nodes through which the user can systematically approach and friend the target with the help of social influence. We have proven that, even when the intermediate nodes are given, quantifying the social influence is NP-hard. We have also proven that finding the best intermediate nodes is NP-hard, but that the problem can be solved in polynomial time on a simplified, well-known social graph model from the literature. In addition, we have explored the effect of social influence on product sales.
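
To give a feel for why quantifying influence is expensive even when the intermediate nodes are fixed, the sketch below computes by brute force the probability that influence starting at a user reaches a target. It assumes an independent-cascade-style model in which each directed edge transmits influence independently with a given probability; the text does not specify the model used, so this model, the function name `reach_probability`, and the example graphs are all hypothetical.

```python
from itertools import product


def reach_probability(edges, source, target):
    """Exact probability that `target` is reachable from `source`.

    `edges` maps a directed pair (u, v) to the probability that the edge
    transmits influence (assumed independent per edge). Every one of the
    2**len(edges) live/blocked outcomes is enumerated, so this is only
    usable on tiny graphs.
    """
    edge_list = list(edges.items())
    total = 0.0
    for live in product([False, True], repeat=len(edge_list)):
        prob = 1.0
        adjacency = {}
        for ((u, v), p), on in zip(edge_list, live):
            prob *= p if on else 1.0 - p
            if on:
                adjacency.setdefault(u, []).append(v)
        # Depth-first search over the live edges only.
        seen = {source}
        stack = [source]
        while stack:
            node = stack.pop()
            for nxt in adjacency.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if target in seen:
            total += prob
    return total


# Hypothetical graphs: user "s" reaches target "t" via intermediates.
one_path = {("s", "a"): 0.5, ("a", "t"): 0.5}
two_paths = {("s", "a"): 0.5, ("a", "t"): 0.5,
             ("s", "b"): 0.5, ("b", "t"): 0.5}
print(reach_probability(one_path, "s", "t"))   # 0.25
print(reach_probability(two_paths, "s", "t"))  # 0.4375
```

Adding a second intermediate raises the probability from 0.25 to 0.4375, which illustrates why the choice of intermediate nodes matters. The enumeration doubles in size with every additional edge, which is consistent with the hardness result stated above, although it does not by itself demonstrate it.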