Date of Award
Master of Science
Big data analytics is used more widely every day across a variety of applications, and these new applications of analytics bring innovative improvements to many fields. Processing big data to obtain fast, secure, and accurate results is a challenging task. Hadoop and Spark are two technologies that handle large amounts of data in a distributed environment using parallel computing. Both use the MapReduce technique to process large datasets. Hadoop's approach to iterative processing affects its data-processing performance, whereas Spark uses in-memory cluster computing and data storage to improve performance across different datasets. A series of experiments was conducted on both Hadoop and Spark with different datasets, and the results were compared to analyze the performance variation between the two frameworks. In addition, an experiment based on financial data (NASDAQ TotalView-ITCH) was performed in the Hadoop environment to analyze stock data and its variations.
Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertation (ETD) Repository.
Murthy, Adithya K., "Big Data Analysis Using Hadoop and Spark" (2017). Electronic Theses and Dissertations. 1700.