Electronic Theses and Dissertations Archive

Big Data Analysis Using Hadoop and Spark

Identifier

6007

Author

Adithya K. Murthy

Date

2017

Document Type

Thesis

Degree Name

Master of Science

Major

Computer Science

Committee Chair

Dipankar Dasgupta

Committee Member

Deepak Venugopal

Committee Member

Fatih Sen

Abstract

Big data analytics is used more widely every day across a variety of applications, and these new ways of applying analytics bring innovative improvements to many fields. Processing big data to obtain fast, secure, and accurate results is a challenging task. Hadoop and Spark are two technologies that handle large amounts of data in a distributed environment using parallel computing. Both use the MapReduce technique to process large datasets. Hadoop's iterative processing capability affects how it processes the data, whereas Spark uses in-memory cluster computing and data storage to improve performance across different datasets. A series of experiments was conducted on both Hadoop and Spark with different datasets. To analyze the performance variation between the two frameworks, a comparative analysis was performed on the results obtained from Hadoop and Spark. An experiment based on financial data (NASDAQ TotalView-ITCH) was also performed in the Hadoop environment to analyze stock data and its variations.

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertations (ETD) Repository.

Recommended Citation

Murthy, Adithya K., "Big Data Analysis Using Hadoop and Spark" (2017). Electronic Theses and Dissertations Archive. 1700.
https://digitalcommons.memphis.edu/etd/1700