It is estimated that 10-30% of disease-associated genetic variants affect splicing. Splicing variants may generate deleteriously altered gene products and are potential therapeutic targets. However, systematic diagnosis or prediction of splicing variants has yet to be established, especially for the near-exon intronic splice region. The major challenge lies in the redundant and ill-defined branch sites and other splicing motifs therein. Here, we carried out unbiased massively parallel splicing assays on 5,307 disease-associated variants that overlap branch sites and collected 5,884 variants across the 5’ splice region. We found that strong splice sites and exonic features protect splicing against intronic sequence variation. While the splicing-altering mechanism of 3’ intronic variants is complex, that of 5’ intronic variants is mainly splice site destruction. Statistical learning combined with these molecular features allows precise prediction of altered splicing caused by an intronic variant. The statistical model identifies and ranks the biological features that determine splicing, which serves as transferable knowledge, and it outperforms the benchmark predictive tool. Moreover, we demonstrated that intronic splicing variants may be associated with disease risk in the human population. Our study elucidates how splicing responds to intronic variants and enables classification of disease-associated splicing variants, toward the promise of precision medicine.
Dr. Chien-Ling Lin is currently an Assistant Professor at the Institute of Molecular Biology, Academia Sinica. Her PhD training at the University of Massachusetts Medical School focused on translational regulation by RNA-protein interactions controlled by sequence motifs in the mRNA untranslated region. During her postdoctoral training at Brown University, she moved toward a systematic understanding of RNA splicing regulation. A major goal of Dr. Lin’s team is to decipher the role of sequence variation in non-coding regions in controlling gene expression. Combining statistical learning and molecular biology, the team aims to elucidate the rules underlying RNA fate determination.