Electronic Theses and Dissertations

Comparing and Contrasting Clustering Analysis Methods: K-means and Vector in Partition

Identifier

6236

Author

Lauren Sobral

Date

2018

Document Type

Thesis

Degree Name

Master of Science

Major

Mathematical Sciences

Concentration

Statistics

Committee Chair

Meredith Ray

Committee Member

Dale Bowman

Committee Member

Su Chen

Abstract

This paper examines the similarities and differences between two methods of exploratory cluster analysis, K-means and Vector in Partition (VIP). Although K-means is the traditional clustering approach, it has limitations when clustering complex datasets, specifically datasets whose variables are multidimensional vectors. The VIP algorithm aims to fill this gap. A novel approach for clustering multidimensional datasets of both continuous and categorical data, the VIP algorithm has preliminary results that support its ability to correctly cluster simulated datasets of the genetic factors gene expression, DNA methylation, and single nucleotide polymorphisms. After both algorithms are explained, a simulated genetic dataset containing variables with multidimensional vectors is analyzed with each of them. The results are then summarized using accuracy, sensitivity, and specificity, highlighting the benefits and limitations of each clustering method.
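As an illustration only, and not code taken from the thesis, the following Python sketch shows the K-means side of such a comparison: a plain Lloyd's-style K-means on simulated two-group data, scored with accuracy, sensitivity, and specificity. The simulated two-dimensional Gaussian data, the function names (`kmeans`, `binary_metrics`, `score_clusters`), and the way cluster labels are matched to the true groups are all assumptions made for this example. The VIP algorithm is not sketched because the abstract does not describe its steps.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


def binary_metrics(truth, predicted):
    """Return (accuracy, sensitivity, specificity), treating label 1 as positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    accuracy = (tp + tn) / len(truth)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity


def score_clusters(truth, labels):
    """Cluster ids are arbitrary, so score both labelings and keep the better one
    (an assumption of this sketch, valid only for two clusters)."""
    flipped = [1 - lab for lab in labels]
    return max(binary_metrics(truth, labels), binary_metrics(truth, flipped))


# Simulated data (hypothetical): two well-separated 2-D Gaussian groups.
rng = random.Random(42)
group0 = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
group1 = [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]
points = group0 + group1
truth = [0] * 50 + [1] * 50

labels, centroids = kmeans(points, k=2)
accuracy, sensitivity, specificity = score_clusters(truth, labels)
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} "
      f"specificity={specificity:.2f}")
```

Because each observation here is a single numeric vector, K-means applies directly. The setting described in the abstract, where each variable is itself a multidimensional vector and data may be categorical, is the case this simple sketch does not handle.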

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertation (ETD) Repository.

Recommended Citation

Sobral, Lauren, "Comparing and Contrasting Clustering Analysis Methods: K-means and Vector in Partition" (2018). Electronic Theses and Dissertations. 1863.
https://digitalcommons.memphis.edu/etd/1863
