Access Type

Open Access Thesis

Date of Award

January 2014

Degree Type

Thesis

Degree Name

M.S.

Department

Computer Science

First Advisor

Hamidreza Chitsaz

Abstract

Computational RNA secondary structure prediction has been an active research topic for several decades. Despite the progress made in the field, even state-of-the-art algorithms do not give satisfactory results, and the accuracy of all existing tools remains limited. Highly complex energy models, different parameter estimation methods, and recent machine learning approaches have not solved this problem. We believe that the first step toward high-quality results is to use an energy model that has the potential to predict accurate output. Hence, a systematic way to analyze the suitability of an energy model is needed. We introduce the notion of learnability to measure this suitability. A learnable energy model has at least one set of parameter values under which the known structure of every RNA determined to date is the minimum free energy structure, which means 100% accuracy. We also derive a necessary condition for a model to be learnable and implement a dynamic programming algorithm to assess this condition for a set of RNAs. This algorithm computes the convex hull of all possible feature vectors for a sequence. When the partition function is written as a polynomial, this convex hull is also the Newton polytope of the partition function. To the best of our knowledge, this is the first systematic approach for evaluating the inherent capability of an energy model.
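The role of the convex hull can be illustrated with a small sketch. It is not the thesis's dynamic programming algorithm: it assumes a linear energy model (energy equal to the dot product of a parameter vector and a structure's feature vector), only two features, and an explicit, made-up list of feature vectors standing in for all structures of one sequence. Under those assumptions, a structure can be the unique minimum free energy structure for some parameter choice only if its feature vector is a vertex of the hull. The names `convex_hull`, `can_be_unique_mfe`, and `toy_vectors` are hypothetical, and the hull is computed with Andrew's monotone chain method.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # a 2-feature vector, e.g. counts of two loop types


def cross(o: Point, a: Point, b: Point) -> int:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order (monotone chain).

    Points that lie strictly inside the hull or in the interior of an edge
    are dropped, so only true vertices remain.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def can_be_unique_mfe(target: Point, all_vectors: List[Point]) -> bool:
    """Assumed vertex test: True if `target` is a vertex of the hull."""
    return target in convex_hull(all_vectors)


# Made-up feature vectors for the structures of one hypothetical sequence.
toy_vectors = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 2), (1, 1), (2, 2)]

print(can_be_unique_mfe((2, 1), toy_vectors))  # hull vertex
print(can_be_unique_mfe((1, 1), toy_vectors))  # interior point
```

In this toy set, `(1, 1)` lies strictly inside the hull, so no parameter vector makes it the unique minimizer of a linear energy, whereas `(2, 1)` is a vertex and passes the test. A real energy model has many more features, which calls for a higher-dimensional hull computation than this two-dimensional sketch.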
