Faculty Publications

Handling big data using a data-aware HDFS and evolutionary clustering technique

Abstract

The increased use of cyber-enabled systems and the Internet of Things (IoT) has led to a massive amount of data with different structures. Most big data solutions are built on top of the Hadoop ecosystem or use its distributed file system (HDFS). However, studies have shown that such systems are inefficient when dealing with today's data. Some research has overcome these problems for specific types of graph data, but today's data comprise more than one type. Such efficiency issues may lead to large-scale problems, including larger space requirements in data centers and wasted resources (such as power consumption), which in turn lead to environmental problems (such as higher carbon emissions) [1]. We propose a data-aware module for the Hadoop ecosystem. We also propose a distributed encoding technique for genetic algorithms to support efficient data processing. Our framework allows Hadoop to manage the distribution and placement of data based on cluster analysis of the data itself. We are able to handle a broad range of data types as well as optimize query time and resource usage. We performed experiments on multiple datasets generated via LUBM (Lehigh University Benchmark) and report the results along with a performance analysis.
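The idea of placing data according to a cluster analysis found by an evolutionary search can be illustrated with a toy example. The following is a minimal sketch only, not the encoding or the HDFS module proposed in the paper: it assumes each data item is a small numeric feature vector, uses a plain single-machine genetic algorithm, and maps each cluster to a hypothetical node label. All function names, parameters, and node labels are assumptions made for illustration.

```python
import random


def within_cluster_cost(points, assignment, k):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = [p for p, a in zip(points, assignment) if a == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return cost


def evolve_placement(points, k, generations=200, pop_size=30,
                     mutation_rate=0.1, seed=0):
    """Evolve a list of cluster ids (one per point) with a simple GA.

    Assumes len(points) >= 2 and pop_size >= 4. The initial population
    includes the "everything in cluster 0" baseline, and the best half
    always survives, so the result is never worse than that baseline.
    """
    rng = random.Random(seed)
    n = len(points)
    population = [[0] * n]
    population += [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size - 1)]

    def fitness(assignment):
        return within_cluster_cost(points, assignment, k)

    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: occasionally reassign an item to a random cluster.
            child = [rng.randrange(k) if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = survivors + children
    return min(population, key=fitness)


# Toy data: two loose groups of 2-D feature vectors (made up for illustration).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
nodes = ["datanode-0", "datanode-1"]  # hypothetical node labels
assignment = evolve_placement(points, k=len(nodes))
placement = {point: nodes[cluster] for point, cluster in zip(points, assignment)}
```

Here each chromosome is simply a list of cluster ids, which keeps crossover and mutation trivial. A distributed encoding such as the one the abstract mentions would have to split this representation across machines, which this sketch does not attempt.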

Publication Title

IEEE Transactions on Big Data

Recommended Citation

Hajeer, M., & Dasgupta, D. (2019). Handling big data using a data-aware HDFS and evolutionary clustering technique. IEEE Transactions on Big Data, 5 (2), 134-147. https://doi.org/10.1109/TBDATA.2017.2782785

