Access Type

Open Access Thesis

Date of Award

January 2014

Degree Type

Thesis

Degree Name

M.S.

Department

Computer Science

First Advisor

Hamidreza Chitsaz

Abstract

Recent progress in DNA amplification techniques, particularly multiple displacement amplification (MDA), has made it possible to sequence and assemble bacterial genomes from a single cell. However, the quality of single-cell genome assembly has not yet reached that of normal multicell genome assembly, owing to the coverage bias and errors caused by MDA. Using a template of more than one cell for MDA, or combining separate MDA products, has been shown to improve the result of genome assembly from a few single cells, but providing identical single cells, a necessary step for these approaches, is a challenge. As a solution to this problem, we give an algorithm for de novo co-assembly of bacterial genomes from multiple single cells. Our novel method not only detects the outlier cells in a pool, it also identifies and eliminates their genomic sequences from the final assembly. Our proposed co-assembly algorithm is based on the colored de Bruijn graph, which was recently proposed for de novo structural variation detection. Our results show that de novo co-assembly of bacterial genomes from multiple single cells outperforms single-cell assembly of each individual cell in all standard metrics. Moreover, our de novo co-assembly also outperforms the mixed assembly in which the input datasets are simply concatenated. We implemented our algorithm in a software tool called HyDA, which is available from http://chitsazlab.org/software/hyda.
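
To make the idea of a colored de Bruijn graph concrete, the following is a minimal Python sketch, not HyDA's implementation. It assumes each cell's reads are given as a list of strings, assigns one color per cell, and stores per-color k-mer counts. The function names (`build_colored_graph`, `successors`, `color_sharing`) are hypothetical, reverse complements are ignored for brevity, and the Jaccard comparison at the end is only an illustrative stand-in for comparing cells, not the outlier-detection procedure used in the thesis.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_colored_graph(datasets, k):
    """Map each k-mer to a list of counts, one entry per dataset (color).

    datasets: list of read lists; the index of a read list is its color.
    """
    n_colors = len(datasets)
    graph = defaultdict(lambda: [0] * n_colors)
    for color, reads in enumerate(datasets):
        for read in reads:
            for km in kmers(read, k):
                graph[km][color] += 1
    return graph


def successors(graph, kmer):
    """Return k-mers in the graph that overlap kmer by k-1 characters."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in graph]


def color_sharing(graph, a, b):
    """Jaccard similarity between the k-mer sets of colors a and b.

    Illustrative only: an assumed, simplistic way to compare two cells.
    """
    in_a = {km for km, counts in graph.items() if counts[a] > 0}
    in_b = {km for km, counts in graph.items() if counts[b] > 0}
    union = in_a | in_b
    return len(in_a & in_b) / len(union) if union else 0.0


# Toy input: cells 0 and 1 share sequence, cell 2 does not.
cells = [["ACGTAC"], ["CGTACG"], ["TTTTGG"]]
graph = build_colored_graph(cells, k=3)
```

Keeping the counts per color, rather than concatenating the reads into one uncolored dataset, is what lets a k-mer seen only in one cell be told apart from a k-mer supported by several cells.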
