Institute of Information Science, Academia Sinica
Topic: Deep Learning for Multimedia Content Analysis
Speaker: Prof. Jen-Chun Lin (Department of Electrical Engineering, Yuan Ze University)
Date: 2019-07-24 (Wed) 10:00 – 12:00
Location: Auditorium 108 at IIS Old Building
Host: Tyng-Luh Liu


In this talk, I will cover two topics closely related to multimedia: automatic music video generation and automatic concert video mashup.

An automated process that can suggest a soundtrack for a user-generated video (UGV) and turn the UGV into a music-compliant, professional-looking video is challenging but desirable. To this end, in the first topic we introduce a systematic approach that links a multi-task deep-net model with metric learning based on dynamic time warping (DTW), and then use it to perform music video (MV) generation. The results of objective and subjective experiments demonstrate that the proposed system performs well and can generate appealing MVs with better viewing and listening experiences.

In the second topic, we aim to classify the types of shots defined by the language of film in order to better portray visual storytelling in a concert video, and we plan to incorporate the technique into our upcoming effort to build an automatic mashup platform. Shot types are fundamental elements of the language of film, commonly used by a director to convey emotion, ideas, and art. To classify shot types from images, we propose a novel probabilistic deep-net framework and term the resulting model the Coherent Classification Net (CC-Net), which is designed to boost classification accuracy. We provide extensive experimental results on a dataset of live concert videos to demonstrate the advantage of the proposed approach.


Jen-Chun Lin received the Ph.D. degree in computer science and information engineering from National Cheng Kung University, Tainan, Taiwan, in 2014. He was a Post-Doctoral Research Fellow with Academia Sinica, Taipei, Taiwan, from 2014 to 2018. He is currently an Assistant Professor with the Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan. An illustration from his 2014 paper in IEEE/ACM Transactions on Audio, Speech, and Language Processing (March/April issue) was selected for the cover of the journal. He also received the Gold Award of the Merry Electroacoustic Thesis Award in 2014; the Excellent PhD Dissertation Award from the Chinese Image Processing and Pattern Recognition Society in 2014; the Excellent PhD Dissertation Award from the Taiwanese Association for Artificial Intelligence in 2014; the Most Interesting Paper Award from the Affective Social Multimedia Computing Workshop in 2015; the Postdoctoral Academic Publication Award from the Ministry of Science and Technology (MOST) in 2017; and the APSIPA Sadaoki Furui Prize Paper Award in 2018. His research interests include multimedia signal processing, pattern analysis and recognition, machine learning, deep learning, and affective computing.