Electronic Theses and Dissertations

Evaluation of Machine Learning Methods for Multivariate Classification with Application to Environmental Datasets

Xianqiang Fu

Date

2023

Document Type

Thesis

Degree Name

Master of Science

Department

Public Health

Committee Chair

Yu Jiang

Committee Member

Hongmei Zhang

Committee Member

Chunrong Jia

Abstract

As environmental data grows in complexity, machine learning offers a way to extract meaningful insights from it. This study investigated the applicability and performance of various machine learning methods for multi-class classification problems, with a specific focus on complex environmental data, including polycyclic aromatic hydrocarbons (PAHs). Using simulation studies, we evaluated ten machine learning models on multivariate classification problems. Regularized Multinomial Logistic Regression (RMLR) achieved higher classification accuracy when the predictor variables were mutually independent, while the Gradient Boosting Machine (GBM) outperformed the other models when the predictors were highly correlated. The feature selection accuracy of three different methods was also evaluated. GBM and Random Forest (RF) showed higher sensitivity than the other methods across the data settings considered. These findings suggest that linear models such as RMLR and MLR may not achieve optimal performance when the predictors are highly correlated, and that tree-based methods such as GBM and RF are a better choice in that setting. Overall, it is crucial to choose machine learning methods according to the complexity of the environmental data and the specific requirements of the task.

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to ProQuest

Notes

Open Access

Recommended Citation

Fu, Xianqiang, "Evaluation of Machine Learning Methods for Multivariate Classification with Application to Environmental Datasets" (2023). Electronic Theses and Dissertations. 3012.
https://digitalcommons.memphis.edu/etd/3012
