
Decode life, Explore the unknown.




                                                   Bioinformatics Lab








Research Faculty
Ting-Yi Sung, Research Fellow
Jan-Ming Ho, Research Fellow
Wen-Lian Hsu, Distinguished Research Fellow
Chung-Yen Lin, Associate Research Fellow
Arthur Chun-Chieh Shih, Research Fellow
Huai-Kuang Tsai, Associate Research Fellow

Postdoctoral
Yu-Jung Chang
Chia-Ying Cheng
Te-Chin Chu
Chan-Hsien Lin
Hsin-Nan Lin
Ke-Shiuan Lynn
Zing Tsung-Yeh Tsai
Yu-Wei Tsay

Our current research is focused on bioinformatics for “omics” studies, classified into two
main areas: (i) genomics and transcriptomics, and (ii) proteomics and metabolomics.
These areas are described below.

1. Genomics and Transcriptomics Studies

With the rise of next-generation sequencing (NGS) as the predominant technology
for genome and transcriptome studies, we have devoted ourselves to developing new
methodologies and tools for analyzing NGS data. First, we have proposed computational
methods to assemble high-throughput short read sequences; to this end, we have
developed a de novo assembler, called JR-Assembler. This tool can assemble a giga-base-pair
genome from Illumina short reads, and is efficient in both memory usage and CPU
time. Second, we have proposed an automated metagenomic data-processing pipeline,
called MetaABC, which integrates several binning tools coupled with methods for
removing artifacts, analyzing unassigned reads, and controlling sampling biases, to achieve
less biased analysis. Third, in order to mine the massive collection of omics data
from model and non-model organisms, we have integrated several open source
software packages with our own tools, enabling NGS and other omics data to be
combined and analyzed at our web server, called Multi-Omics Online Analysis System
(available at http://molas.iis.sinica.edu.tw). Fourth, we are working on read alignment; NGS
reads are getting longer, but most existing short-read aligners were developed and
optimized for reads of 100 bp or shorter. We have developed a new alignment algorithm, called
Kart, which can efficiently produce reliable, longer alignments with a low error rate, and
can handle PacBio reads with high accuracy. Fifth, to address the increase in computing
power required for biological research, we have collaborated with colleagues in our
institute to implement a user-friendly tool for biologists, called CloudDOE
(http://clouddoe.iis.sinica.edu.tw/); this tool uses a Hadoop cloud, which can substantially reduce
the complexity and costs of deployment, execution, enhancement, and management of
computing resources.

We have been using the aforementioned methods and tools to tackle various biological
problems, and the following in particular: (a) gene duplication in C4 plant leaf evolution,
(b) reconstruction of regulatory networks of maize leaf development, (c) integration of
transcription factors, miRNAs, and epigenetic information to study gene regulation, (d)
reconstruction of miRNA-gene regulatory networks in cardiac hypertrophy and B cell
differentiation, (e) identification of structural variations in the autism genome, (f) functional
analysis of non-coding RNAs in human, (g) identification of druggable oncogene fusions
and the underlying mechanisms, and (h) viral genome recombination and genotyping.

2. Proteomics and Metabolomics

Mass Spectrometry (MS)-based proteomics and metabolomics. MS has become the
predominant technology for proteomics research. Protein identification and quantitation are
the two main purposes of mass spectral analysis. Previously we focused on developing
bioinformatics systems for quantitation analysis, through which we created three tools,
i.e., Multi-Q, MaXIC-Q, and IDEAL-Q, for various experimental quantitation approaches.
Currently, we are working on improving protein identification. First, we have recently
proposed novel methods for glycoprotein identification (including the implementation
of an automated tool called MAGIC), since glycosylation is considered to be the most
important post-translational modification (PTM), and analysis of MS/MS data acquired
from glycoproteomic experiments is challenging. Second, we have proposed a method
to utilize SWATH-MS data for protein identification. SWATH is a data-independent
acquisition method developed in recent years primarily for targeted proteomics analysis,
and this method has since attracted considerable attention. Since the high-throughput
data generated from SWATH-MS is mainly used for targeted proteomics analysis, we pro-





