Research Laboratories

Multimedia Technology Laboratory

Research Faculty

Hong-Yuan Mark Liao, Research Fellow
Chu-Song Chen, Research Fellow
Wen-Liang Hwang, Research Fellow
Chun-Shien Lu, Associate Research Fellow
Tyng-Luh Liu, Research Fellow

Group Profile
Multimedia shapes our future.

Over the past two decades, multimedia technology has influenced many aspects of our daily lives. Alongside biotechnology and nanotechnology, multimedia technology is considered one of the three most promising industries of the twenty-first century. Multimedia research covers a broad range of techniques and rich applications, including those involving music, video, images, text, and 3-D animation.

In the next few years, we will continue to devote our research efforts to advancing key fields in multimedia, including multi-perspective computer vision, compressive sensing/sparse representation, and video forensics. In what follows, we describe some of these key fields in detail.

A. Video Forensics

Since the 9/11 attacks on the United States, counter-terrorism strategies have been given high priority in many countries. Surveillance camcorders are now almost ubiquitous in modern cities. As a result, the amount of recorded data is enormous, and searching the digital video content manually is time-consuming. In the next few years, we shall devote part of our effort to video forensics, in which a major proportion of the research is mining for criminal evidence in videos recorded by a heterogeneous collection of surveillance camcorders. This is a new interdisciplinary field, and people working in it need video processing skills as well as in-depth knowledge of forensic science; hence the barrier to entry is high. Mining surveillance videos directly for criminal evidence is very different from conventional crime scene investigation. In the latter, detectives need to actually visit the crime scene, check all available details, and collect as much physical evidence as possible. By contrast, to conduct crime scene investigations directly from surveillance videos, forensic experts need to develop software that facilitates the automatic detection, tracking, and recognition of objects in the videos. Since the videos are captured by heterogeneous camcorders, performing evidence mining on them is more challenging. We shall start by addressing the multiple-camera people counting problem as well as visual knowledge transfer among a heterogeneous collection of surveillance camcorders.

B. Compressive Sensing and Sparse Representation

Compressed Sensing/Sampling (CS) is a revolutionary technology for simultaneously sensing and compressing signals, and it establishes a new sampling theory beyond the Nyquist rate. It enables joint data acquisition and compression at slight cost to the encoder (which suits resource-limited mobile devices and sensors) while shifting the major computational overhead to the decoder. Under the assumption that the signal is sparse, CS can, in theory, perfectly reconstruct the original signal from (far) fewer measurements via convex optimization or greedy algorithms. This completely new idea has made CS a hot topic in signal processing-related fields since its first appearance in 2006. Furthermore, CS has been adopted in a broad range of areas for problems that are inherently sparse or can be sparsified.

Undoubtedly, this emerging area opens opportunities for the study of both fundamental issues and application-oriented problems. In the future, we plan to study the following topics: (1) Fast Compressed Image Sensing (CIS); (2) Fast Orthogonal Matching Pursuit (FOMP); (3) multiple-input systems exploiting sparse representation (e.g., microphone array signal processing); and (4) single-pass codeword learning for sparse representation.

C. Multi-perspective Computer Vision

Making computers capable of perceiving real-world visual information from various cues is challenging because of highly complex concepts, changing environments, free motion, high articulation, and so on. As many visual concepts are difficult to summarize in simple, plain rules, (statistical) machine learning has played an important role in the past decade (as witnessed at major conferences such as CVPR, ICCV, and NIPS), and it is still expected to be vital to the progress of computer vision. In addition, owing to the considerable growth in the amount of data in the Internet age, training on large-scale (and possibly noisy) datasets has become a significant issue. Furthermore, instead of observing the world only through color images taken from common viewing angles, 3D imaging (providing additional depth information) and flying cameras (providing less common viewing angles, such as bird's-eye views) could also bring opportunities for developing novel applications in the near future. High-level visual concepts, such as aesthetics, have also been shown to be amenable to machine learning. To address the above issues, we will study several topics toward understanding visual information from multiple perspectives: (1) object detection, recognition, and segmentation from visual saliency; (2) tracking and interacting with flying cameras; (3) on-line aesthetic value assessment when shooting; and (4) deriving 3D structure from conventional camera images. The research outcomes are expected to help computers understand human intention, assist people in living better-quality and safer lives, and support robots in seeing and understanding the world better.
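
As an illustration of the compressive sensing section's claim that a sparse signal can be reconstructed from far fewer measurements with a greedy algorithm, the following is a minimal sketch of textbook Orthogonal Matching Pursuit in plain Python. It is not the laboratory's Fast OMP (FOMP), whose details are not given here. The function names (dot, solve, omp, demo), the random Gaussian sensing matrix with unit-norm columns, and the sizes n=256, m=100, k=3 are all assumptions chosen for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(G, b):
    """Solve the small square system G c = b (Gaussian elimination, partial pivoting)."""
    k = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            f = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= f * M[col][c]
    c = [0.0] * k
    for i in reversed(range(k)):
        c[i] = (M[i][k] - sum(M[i][j] * c[j] for j in range(i + 1, k))) / M[i][i]
    return c


def omp(columns, y, k):
    """Textbook OMP: return (support, coefficients) using at most k columns.

    `columns` stores the sensing matrix column by column (an illustrative
    layout); `y` is the measurement vector.
    """
    residual = list(y)
    support, coef = [], []
    for _ in range(k):
        # Greedy step: pick the unused column most correlated with the residual.
        best = max((j for j in range(len(columns)) if j not in support),
                   key=lambda j: abs(dot(columns[j], residual)))
        support.append(best)
        # Orthogonal step: least-squares fit of y on the chosen columns
        # via the normal equations, then recompute the residual.
        chosen = [columns[j] for j in support]
        gram = [[dot(u, v) for v in chosen] for u in chosen]
        coef = solve(gram, [dot(u, y) for u in chosen])
        residual = [yi - sum(c * u[i] for c, u in zip(coef, chosen))
                    for i, yi in enumerate(y)]
        if math.sqrt(dot(residual, residual)) < 1e-9:
            break
    return support, coef


def demo(n=256, m=100, k=3, seed=0):
    """Recover a k-sparse length-n signal from m < n random measurements."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n):
        col = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(dot(col, col))
        columns.append([v / norm for v in col])  # unit-norm columns (assumption)
    x = [0.0] * n
    for j in rng.sample(range(n), k):
        x[j] = rng.choice([-1.0, 1.0]) * rng.uniform(1.0, 2.0)
    # Encoder side: m inner products, cheap compared with the decoder's search.
    y = [sum(columns[j][i] * x[j] for j in range(n)) for i in range(m)]
    support, coef = omp(columns, y, k)
    x_hat = [0.0] * n
    for j, c in zip(support, coef):
        x_hat[j] = c
    return x, x_hat
```

Calling demo() returns the original sparse signal and its estimate, so the two can be compared entry by entry. The split of work mirrors the encoder/decoder asymmetry described in the text: measuring is a single matrix-vector product, while the search over columns and the repeated least-squares fits happen only at reconstruction time.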
