Institute of Information Science, Academia Sinica
Topic: Smart statistical analysis software
Speaker: Prof. Guan-Hua Huang (Institute of Statistics, National Chiao Tung University)
Date: 2019-02-26 (Tue) 10:00 – 12:00
Location: Auditorium 106, IIS New Building
Host: Keh-Yih Su


Using traditional statistical analysis software requires many "expert opinions" and much "human intervention", which makes the software difficult for non-statisticians to use and puts analysis automation out of reach. Our lab aims to develop "smart statistical analysis software". The software first asks the user to describe the problem in a few paragraphs. It then uses natural language processing techniques to identify the problem automatically and performs the best statistical analysis based on the information provided. An answer statement is then generated automatically in response to the question.

In this talk, I will first share our thoughts on developing the smart statistical analysis software and then introduce our results in "Domain-specific word embeddings" and "Hyper-parameter selection via machine learning classification".


Dr. Guan-Hua Huang joined the faculty of National Chiao Tung University in 2003. He received his Ph.D. in Biostatistics from the Johns Hopkins School of Public Health. Dr. Huang is a Professor in the Institute of Statistics and an Adjunct Professor in the Institute of Data Science and Engineering. From 2000 to 2003, he was an Assistant Professor in the Department of Population Health Sciences and held a joint appointment with the Department of Biostatistics and Medical Informatics at the University of Wisconsin-Madison.
Dr. Huang's research is informed by the fact that in many medical studies, the definitive outcome is inaccessible, and a valid surrogate endpoint is then measured in place of the clinically most meaningful endpoint. Therefore, his primary research interest is in developing latent variable/class models for analyzing this kind of data structure.
Dr. Huang is also interested in statistical problems in genetics, genomics, and bioinformatics. He is particularly interested in statistical validation of endophenotypes, high-throughput genomic data analysis, gene-gene interaction detection, next-generation sequencing data analysis, and copy number variation identification.
Dr. Huang is deeply invested in many scientific areas, including aging, nursing, schizophrenia, diabetes, sensory impairments (hearing, vision, olfaction), and data science.