Electronic Theses and Dissertations
Date
2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Mathematical Sciences
Committee Chair
Hongmei Zhang
Committee Member
E. Olusegun George
Committee Member
Hui Zhang
Committee Member
Ching-Chi Yang
Abstract
High-dimensional data are widely studied in many areas, such as epigenetics and the biological sciences. However, developing and applying clustering methods for high-dimensional data is challenging. In this dissertation, we propose two clustering methods, one for one-dimensional DNA methylation (DNAm) data and one for two-dimensional stochastic optical reconstruction microscopy (STORM) image data. DNAm changes are known to be associated with different age stages, and the mapped genes could be linked to the incidence of diseases. Thus, learning the features of DNAm at different CpG sites will benefit subsequent epigenetic epidemiological studies on marker detection for health outcomes or exposures. However, no methods are currently available to effectively and efficiently identify dynamic and stable CpGs. Hence, we develop a Bayesian two-stage clustering method to 1) determine whether DNAm at a CpG site is stable over time, and 2) assign each unstable CpG site to a specific cluster based on the temporal trend of DNAm at that site. We use simulations to demonstrate and evaluate the proposed method. We then demonstrate it on real data: M-values of DNA methylation at 2,000 CpG sites for 325 subjects at two time points from the Isle of Wight birth cohort. STORM is a single-molecule localization microscopy (SMLM) technique. SMLM images provide an opportunity to observe the isolated physical locations of proteins and to study protein interactions by analyzing point patterns using spatial point theories. Previous popular methods for this type of problem are largely designed for single-image analysis and have various limitations, such as a heavy simulation load and false positives. Hence, we are motivated to propose new methods to overcome these limitations. We develop standardized statistics to study location patterns (clustered, dispersed, or random) of a single species in 2D images, so that patterns can be compared across different treatment conditions or study groups.
Simulations are used to demonstrate and assess the developed method. We then apply the approach to 2D STORM image data generated by the Department of Immunology, St. Jude Children’s Research Hospital. The data set is composed of the x, y coordinates of mitochondria and lysosomes in 10 cells under two conditions.
Library Comment
Dissertation or thesis originally submitted to ProQuest.
Notes
Open Access
Recommended Citation
Han, Luhang, "Bayesian approach of clustering one-dimensional data and identifying pattern of two-dimensional data via standardized statistics" (2024). Electronic Theses and Dissertations. 3494.
https://digitalcommons.memphis.edu/etd/3494
Comments
Data is provided by the student.