Sabaragamuwa University of Sri Lanka

Analyzing and Optimizing the Performance of Big Data Platform: A Case Study Based on Apache Hadoop MapReduce Framework

dc.contributor.author Jayamaha, J.H.R.P.
dc.contributor.author Jayasena, K.P.N.
dc.date.accessioned 2023-09-16T07:18:03Z
dc.date.available 2023-09-16T07:18:03Z
dc.date.issued 2022-04-06
dc.identifier.isbn 978-624-5727-21-6
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/3951
dc.description.abstract MapReduce is among the most effective and efficient methods for processing large data sets. Various methods and techniques have been proposed for MapReduce processing. Large-scale data processing and analysis can be performed on commodity hardware using the Apache Hadoop distributed framework. Hadoop exposes tunable configuration parameters that have a significant impact on the performance of MapReduce applications, and adjusting these parameters is an effective way to boost performance. New research areas have emerged around the Hadoop MapReduce framework. Performance optimization depends mainly on the configuration of concurrent containers and a suitably configured Hadoop Distributed File System (HDFS). Concurrent container performance in turn depends on CPU performance, network parameters, and memory utilization. All of these factors affect the performance of the Hadoop MapReduce framework. In this study, we consider these factors in optimizing the performance of the Apache Hadoop MapReduce framework, focusing on container performance and the HDFS block size. The primary outcome of this project is to identify the best system architecture and a suitable HDFS block size. Such performance tuning is highly advantageous in Apache Hadoop. In the experiment, we first analyzed the performance of MapReduce under the default Hadoop configuration. After optimization, the tuned system significantly improved big data MapReduce processing. According to the experiment, Hadoop MapReduce performance depends on the HDFS block size. As the dataset grows larger, the HDFS block size must be increased to improve performance. The performance of concurrent containers also strongly affects overall job performance, and container memory size has a greater effect than CPU count. All of these factors were determined after multiple trials to yield accurate results.
All of these factors have a significant impact on the performance of Hadoop MapReduce. en_US
dc.language.iso en en_US
dc.publisher Sabaragamuwa University of Sri Lanka en_US
dc.subject Apache Hadoop en_US
dc.subject Concurrent Container en_US
dc.subject Hadoop Distributed File System en_US
dc.subject Map-Reduce en_US
dc.title Analyzing and Optimizing the Performance of Big Data Platform: A Case Study Based on Apache Hadoop MapReduce Framework en_US
dc.type Book en_US
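
Note: the two relationships named in the abstract (block size versus dataset size, and container memory versus CPU count) can be illustrated with rough arithmetic. The sketch below is not taken from the study. It assumes one input split, and therefore one map task, per HDFS block, and a scheduler that enforces both memory and virtual-core limits. The function names and all numeric values (a 10 GiB dataset, an 8 GiB node with 8 cores, 2 GiB containers) are hypothetical.

```python
MIB = 1024 * 1024


def estimated_map_tasks(dataset_bytes: int, block_bytes: int) -> int:
    """Approximate map-task count, assuming one input split per HDFS block."""
    # Integer ceiling division: a partially filled last block still needs a task.
    return (dataset_bytes + block_bytes - 1) // block_bytes


def concurrent_containers(node_memory_mb: int, node_vcores: int,
                          container_memory_mb: int,
                          container_vcores: int = 1) -> int:
    """Containers that fit on one node when memory and vcores are both enforced."""
    by_memory = node_memory_mb // container_memory_mb
    by_cpu = node_vcores // container_vcores
    # The scarcer resource sets the limit.
    return min(by_memory, by_cpu)


if __name__ == "__main__":
    dataset = 10 * 1024 * MIB  # hypothetical 10 GiB input
    for block_mib in (64, 128, 256):
        tasks = estimated_map_tasks(dataset, block_mib * MIB)
        print(f"block size {block_mib} MiB: about {tasks} map tasks")

    # Hypothetical node: 8 GiB of memory, 8 vcores, 2 GiB per container.
    print("containers per node:", concurrent_containers(8192, 8, 2048))
```

Under these assumptions, doubling the block size halves the number of map tasks to schedule, and the example node runs out of memory for containers before it runs out of cores, which is one way memory size can matter more than CPU count.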

