Access Type

Open Access Dissertation

Date of Award

January 2010

Degree Type


Degree Name



Computer Science

First Advisor

Sorin Draghici


The development of high-throughput technologies such as DNA microarrays has enabled researchers to measure expression levels on a genomic scale. Correct and efficient biological interpretation of the voluminous data generated by these technologies, however, remains a challenging problem. A commonly used approach to interpreting the results of such high-throughput experiments is to map the list of differentially expressed (DE) genes to Gene Ontology (GO) terms, which provides a list of biological processes, biochemical functions, and cellular locations associated with the DE genes. A previously unexplored aspect is the identification of unusual associations between biological processes. Such associations may signal biological processes that interact in a specific way in the condition under study. Here we present a novel approach that aims to identify associations between biological processes that differ significantly in a given phenotype relative to the normal condition. We applied our approach to two real data sets involving breast and lung cancer, and predicted associations among GO biological processes that were annotated with differentially expressed genes. An extensive manual review of the literature found more than 89% of the predicted associations to be correct and valid. A subset of these interactions is discussed in detail and shown to have the potential to open a number of new avenues for research in lung and breast cancer. These results indicate that expanding interpretation efforts beyond single processes may be useful in understanding specific experiments.
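
The mapping of DE genes to GO terms mentioned above is commonly scored with an over-representation test. The following is a minimal sketch of that generic step only, using a one-sided hypergeometric test; it is not the association method developed in the dissertation, whose statistics the abstract does not specify. The function names (`hypergeom_sf`, `enrich`), the gene labels, and the term labels are hypothetical and chosen for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are annotated with the term of interest."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrich(de_genes, annotations, background):
    """Return {term: (overlap, p_value)} for every term in `annotations`.

    Genes outside `background` are ignored, so all counts refer to the
    same gene universe.
    """
    universe = set(background)
    de = set(de_genes) & universe
    results = {}
    for term, genes in annotations.items():
        annotated = set(genes) & universe
        overlap = len(de & annotated)
        p_value = hypergeom_sf(overlap, len(universe), len(annotated), len(de))
        results[term] = (overlap, p_value)
    return results


# Toy data (assumed, not taken from the dissertation).
background = [f"g{i}" for i in range(10)]
annotations = {
    "term_A": ["g0", "g1", "g2"],
    "term_B": ["g3", "g4", "g5"],
}
de_genes = ["g0", "g1", "g3"]

results = enrich(de_genes, annotations, background)
for term, (overlap, p_value) in results.items():
    print(term, overlap, round(p_value, 4))
```

In practice, p-values from many terms would need a multiple-testing correction, and the choice of background gene set strongly affects the result; both are left out of this sketch.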