Electronic Theses and Dissertations
Identifier
4821
Date
2016
Document Type
Dissertation (Access Restricted)
Degree Name
Doctor of Philosophy
Major
Computer Science
Committee Chair
Dipankar Dasgupta
Committee Member
Vasile Rus
Committee Member
Lan Wang
Committee Member
Zhuo Lu
Abstract
Many tools and techniques have been developed to analyze big collections of data. The increased use of cyber-enabled systems, such as Internet-of-Things (IoT) devices and sensors, is generating a massive amount of data with different structures. Most of the new big data solutions are built on top of the Hadoop ecosystem, or at least use its distributed file system (HDFS). However, studies have shown that such systems are inefficient in dealing with modern data. Although some research has overcome these problems for specific types of graph data, modern data come in more than one type. Such efficiency issues lead to larger-scale problems, such as larger datacenter space and wasted resources like network usage and power consumption, which in turn lead to environmental problems. This dissertation proposes a data-aware packaging for the Hadoop ecosystem and its distributed file system. Such a framework allows Hadoop to manage the distribution and placement of data based on cluster analysis of the data itself. Unlike previous efforts, I was able to handle a broader range of data types, optimizing a wider range of processes as well as query time and resource usage.
Library Comment
Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertations (ETD) Repository.
Recommended Citation
Hajeer, Mustafa Hussein, "Handling Big Data With A Data-Aware HDFS Using Evolutionary Clustering Technique" (2016). Electronic Theses and Dissertations. 2254.
https://digitalcommons.memphis.edu/etd/2254
Comments
Data is provided by the student.