Access Type

Open Access Dissertation

Date of Award

January 2016

Degree Type


Degree Name



Molecular Biology and Genetics

First Advisor

Roger Pique-Regi

Second Advisor

Francesca Luca


Advances in next-generation sequencing technologies and functional genomics strategies have allowed researchers to identify both common and rare genetic variation, to deeply profile gene expression, and even to determine regions of active gene transcription.

While these technologies and strategies have contributed greatly to our understanding of complex traits and diseases, many biological questions and analytical issues remain to be addressed.

Genome-wide association studies (GWAS) have successfully identified large numbers of genetic variants associated with complex traits and diseases. However, in many cases the mechanistic link between the phenotype and the associated variant remains unclear. This may be because most variants identified by GWAS lie outside coding regions and likely affect regulatory regions that are not well characterized. Chapter 2 describes a computational approach that integrates functional genomics data with DNA sequence models to predict which variants in a DNase I footprint affect transcription factor binding. These predictions prove useful in assessing which variants in GWAS-associated regions are likely to be causal.

While a mismatch between genotype and environment can lead to increased disease risk, it can be difficult to study the role of the environment directly, because environmental covariates are complex and difficult to control at the organismal level. Alternatively, we can study the cellular environment, using in vitro treatments as a proxy for the organismal environment. Chapters 3 and 4 describe a high-throughput system for characterizing the gene expression response across a panel of cell types and treatment conditions. Chapter 4 identifies genes with gene-environment interactions (GxE); follow-up studies have shown that these genes are also associated with a phenotype through GWAS, providing putative molecular mechanisms through which the environment influences the trait.

Together, these chapters present novel approaches and analyses of next-generation sequencing data to identify functional variation and gene-environment interactions.