Meeting number: 2516 788 3277
In this talk, I will first review our work on learning to synthesize image and video content from image data. The underlying theme is to exploit different priors to synthesize diverse content with robust formulations. One related issue is learning effective models from limited data, and I will present several methods that advance the state of the art. I will then present our image and video synthesis work using vision and language models. Time permitting, I will also discuss recent findings on other vision tasks.
Ming-Hsuan Yang is a professor of Electrical Engineering and Computer Science at the University of California, Merced, and a research scientist at Google. He received a Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign in 2000. He served as a program co-chair for the IEEE International Conference on Computer Vision (ICCV) in 2019 and the Asian Conference on Computer Vision (ACCV) in 2014, and as a general co-chair for the Asian Conference on Computer Vision in 2016. He served as an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) from 2007 to 2011, and serves as an associate editor of the International Journal of Computer Vision (IJCV) and Image and Vision Computing (IVC). He is a co-editor-in-chief of Computer Vision and Image Understanding (CVIU). He received paper awards from CVPR, ACCV, and UIST. Yang received a Google Faculty Research Award in 2009, the Distinguished Early Career Research Award from the UC Merced Senate in 2011, a CAREER Award from the National Science Foundation in 2012, and the Distinguished Research Award from the UC Merced Senate in 2015. He is a Fellow of the IEEE and the ACM.