Electronic Theses and Dissertations
Identifier
449
Date
2011
Document Type
Thesis
Degree Name
Master of Science
Major
Biology
Committee Chair
Tit-Yee Wong
Committee Member
King Thom Chung
Committee Member
Lih Yuan Deng
Abstract
Metagenomics is the study of microbes in their natural environments without the need for isolation and lab cultivation. The DNA fragments obtained by sequencing a sample of mixed species require a taxonomic characterization step called binning. My research concerns the binning of metagenomic data using a novel approach. Each genomic sequence was encoded by its Cistronic Stop Signal Ratio (CSSR) values. Since the genic CSSR values of phylogenetically related organisms often share a definable pattern, a neural network was trained to recognize the genic CSSR patterns of known species. The trained neural network was then used to cluster the CSSR values from the metagenomic data. To show the validity of this method, a total of 15,000 genic CSSR values were calculated from five different bacterial species. The data were randomly mixed, and a neural network was used to identify the species of origin of each gene based on its CSSR values. Results showed that more than 95% of the genes were binned to the correct species. The metagenomic sequences from the fecal samples of 124 individuals were then reanalyzed with the CSSR-neural network method, after training the network on the genic CSSR values of a set of known enteric bacteria. The resulting clusters are discussed.
Library Comment
Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertation (ETD) Repository.
Recommended Citation
Bhaskarabhatla, Rekha, "Binning Metagenomic Data by CSSR" (2011). Electronic Theses and Dissertations. 357.
https://digitalcommons.memphis.edu/etd/357
Comments
Data is provided by the student.