Access Type

Open Access Dissertation

Date of Award

January 2014

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

First Advisor

Dongxiao Zhu

Abstract

Alternative splicing plays a key role in regulating gene expression, and more than 90% of human genes undergo one or more types of alternative splicing. Dysregulated alternative splicing events have been linked to a number of human diseases. Recently, high-throughput RNA-Seq technologies have provided unprecedented opportunities to characterize and understand transcriptomes, and they are particularly useful for detecting splicing variants between healthy and diseased human transcriptomes.

We have developed two novel algorithms, each with an accompanying tool, and a computational workflow to compare human transcriptomes between healthy and diseased conditions. The first, RAEM, is a read count-based Expectation-Maximization (EM) algorithm and tool. It estimates relative transcript isoform proportions by maximizing the likelihood within each gene. RAEM is implemented in our published software suite, SAMMate, and we have used it to predict isoform-level microRNA-155 targets. The second, dSpliceType, is a read coverage-based algorithm and tool for detecting differential splicing events. It exploits the sequential dependency of normalized base-wise read coverage signals and applies a change-point analysis, followed by a parametric statistical hypothesis test using the Schwarz Information Criterion (SIC), to detect significant differential splicing events in the form of the five well-known splicing types. The results of both simulation and real-world studies demonstrate that dSpliceType is an efficient computational tool for detecting various types of differential splicing events across a wide range of expressed genes. Finally, we developed a novel computational workflow to jointly study human diseases in terms of both differential expression and differential splicing. The workflow has been used to detect differential splicing variants in non-differentially expressed genes in human idiopathic pulmonary fibrosis (IPF) lung disease.
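To make the idea of estimating isoform proportions by maximizing a per-gene likelihood more concrete, the following is a minimal sketch of a generic EM iteration for a mixture with known components. It is not the RAEM implementation and does not reproduce its model. The function name `em_isoform_proportions`, the grouping of reads into classes (for example, exons), and the matrix `cond_prob` (the probability that a read from a given isoform falls into each class) are assumptions introduced only for illustration. The estimated proportions are fractions of reads, and any length normalization is omitted.

```python
import numpy as np


def em_isoform_proportions(counts, cond_prob, n_iter=500, tol=1e-10):
    """Generic EM for mixture proportions with known components (illustrative).

    counts    : length-J array of read counts per read class (hypothetical grouping)
    cond_prob : J x K array, cond_prob[j, k] = P(read in class j | isoform k)
    Returns a length-K array of estimated read fractions per isoform.
    """
    counts = np.asarray(counts, dtype=float)
    cond_prob = np.asarray(cond_prob, dtype=float)
    n_isoforms = cond_prob.shape[1]

    # Start from equal proportions.
    theta = np.full(n_isoforms, 1.0 / n_isoforms)

    for _ in range(n_iter):
        # E-step: posterior probability that a read in class j came from isoform k.
        weighted = cond_prob * theta
        denom = weighted.sum(axis=1, keepdims=True)
        resp = np.divide(weighted, denom,
                         out=np.zeros_like(weighted), where=denom > 0)

        # M-step: expected number of reads per isoform, renormalized to sum to 1.
        expected = (counts[:, None] * resp).sum(axis=0)
        new_theta = expected / expected.sum()

        converged = np.abs(new_theta - theta).max() < tol
        theta = new_theta
        if converged:
            break

    return theta


# Toy gene: isoform A covers exons 1-3 uniformly, isoform B skips exon 2.
cond_prob = np.array([[1 / 3, 1 / 2],
                      [1 / 3, 0.0],
                      [1 / 3, 1 / 2]])
counts = np.array([400, 200, 400])
theta = em_isoform_proportions(counts, cond_prob)
```

In this toy example the counts are exactly those expected when 60% of reads come from isoform A and 40% from isoform B, so the iteration converges to approximately those fractions.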
