Hadoop MapReduce Performance Improvement In Distributed System
  • Author(s): Saw Mya Nandar ; Thaint Zarli Myint
  • Paper ID: 1701535
  • Page: 487-493
  • Published Date: 22-08-2019
  • Published In: Iconic Research And Engineering Journals
  • Publisher: IRE Journals
  • e-ISSN: 2456-8880
  • Volume/Issue: Volume 3 Issue 2 August-2019
Abstract

MapReduce is a parallel computing framework for the distributed processing of large-scale, data-intensive applications. The most important performance metric is job execution time, which can be seriously degraded by straggler machines. Speculative execution is a common remedy for this problem: slow tasks are backed up on alternative machines. Several schedulers with speculative execution have been proposed, but they have two weaknesses: (i) they cannot calculate the progress rate accurately because the progress scores of the phases are set to constant values, which may be far from the actual values in a heterogeneous environment; and (ii) they identify stragglers with a static threshold on the temporal difference between an individual task's progress and the average task progress. To improve performance, this paper proposes an algorithm that identifies stragglers using a more accurate progress estimate for each job, based on its own historical information, and a dynamic threshold that adjusts automatically to the continuously varying environment.
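The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes that phase weights are taken as each phase's share of the time recorded for previously completed tasks, and that the dynamic threshold is the mean progress rate minus k standard deviations, recomputed over the currently running tasks. The function names, the phase names, and the parameter k are all hypothetical.

```python
from statistics import mean, pstdev


def phase_weights(history):
    """Estimate phase weights from completed tasks (assumed approach).

    history: list of dicts mapping phase name -> seconds spent in that phase.
    Returns each phase's share of the total recorded time, so the weights
    reflect the observed workload instead of fixed constants.
    """
    totals = {}
    for record in history:
        for phase, seconds in record.items():
            totals[phase] = totals.get(phase, 0.0) + seconds
    grand_total = sum(totals.values())
    return {phase: t / grand_total for phase, t in totals.items()}


def progress_score(weights, phase_order, current_phase, fraction_done):
    """Weighted progress in [0, 1].

    Finished phases count with their full weight; the current phase counts
    in proportion to the fraction of it that is done.
    """
    score = 0.0
    for phase in phase_order:
        if phase == current_phase:
            return score + weights[phase] * fraction_done
        score += weights[phase]
    raise ValueError(f"unknown phase: {current_phase}")


def find_stragglers(rates, k=1.0):
    """Flag tasks whose progress rate falls below a dynamic threshold.

    rates: dict mapping task id -> progress rate (progress score / elapsed s).
    The threshold (mean - k * standard deviation) is an assumption; it moves
    with the running tasks rather than being a fixed constant.
    """
    values = list(rates.values())
    threshold = mean(values) - k * pstdev(values)
    return sorted(task for task, rate in rates.items() if rate < threshold)


# Hypothetical history of two completed reduce tasks (seconds per phase).
history = [
    {"copy": 60, "sort": 10, "reduce": 30},
    {"copy": 40, "sort": 10, "reduce": 50},
]
weights = phase_weights(history)
score = progress_score(weights, ["copy", "sort", "reduce"], "sort", 0.5)

# Hypothetical progress rates of four running tasks.
rates = {"t1": 0.010, "t2": 0.011, "t3": 0.009, "t4": 0.002}
stragglers = find_stragglers(rates)
```

Because the weights are derived from history and the threshold from the current rate distribution, both follow changes in the cluster. With fixed per-phase constants and a fixed cutoff, the same task could be misjudged on nodes of different speeds.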

Citations

IRE Journals:
Saw Mya Nandar, Thaint Zarli Myint, "Hadoop MapReduce Performance Improvement In Distributed System," Iconic Research And Engineering Journals, Volume 3, Issue 2, 2019, Page 487-493

IEEE:
Saw Mya Nandar, Thaint Zarli Myint, "Hadoop MapReduce Performance Improvement In Distributed System," Iconic Research And Engineering Journals, 3(2)