Merging Small Files For Cloud Storage Using Agglomerative Hierarchical Clustering
  • Author(s): Htu Ra
  • Paper ID: 1701310
  • Page: 180-186
  • Published Date: 01-07-2019
  • Published In: Iconic Research And Engineering Journals
  • Publisher: IRE Journals
  • e-ISSN: 2456-8880
  • Volume/Issue: Volume 2 Issue 12 June-2019
Abstract

The Hadoop Distributed File System (HDFS) was originally designed for large files. HDFS stores each small file as a separate block, even when the file is much smaller than the block size. A massive number of small files therefore creates a correspondingly large number of blocks, and when many small files are accessed, the NameNode often becomes the bottleneck. The difficulty of storing and accessing a large number of small files is known as the small file problem. To address this problem in HDFS, an approach for merging small files is proposed. In this paper, small files are merged into larger files using an agglomerative hierarchical clustering mechanism in order to reduce NameNode memory consumption. The approach is intended to support storing small files in cloud storage.
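
The abstract does not state which feature or distance the clustering uses, so the following Python sketch only illustrates the general idea under stated assumptions. Each file starts as its own cluster, and the two closest clusters are merged repeatedly as long as the merged size still fits in one block. The `SmallFile` type, the numeric `feature` field (for example, a last-access time), the centroid distance, and the 128 MB block size are all illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from itertools import combinations

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (128 MB)


@dataclass(frozen=True)
class SmallFile:
    name: str
    size: int       # file size in bytes
    feature: float  # hypothetical clustering feature, e.g. last-access time


def cluster_size(cluster):
    """Total size in bytes of all files in a cluster."""
    return sum(f.size for f in cluster)


def centroid(cluster):
    """Mean feature value of a cluster (assumed distance basis)."""
    return sum(f.feature for f in cluster) / len(cluster)


def merge_small_files(files, block_size=BLOCK_SIZE):
    """Bottom-up (agglomerative) grouping of files into merge candidates.

    Repeatedly merges the pair of clusters with the smallest centroid
    distance, skipping any pair whose combined size exceeds block_size.
    Stops when no remaining pair can be merged.
    """
    clusters = [[f] for f in files]  # every file begins as its own cluster
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            if cluster_size(clusters[i]) + cluster_size(clusters[j]) > block_size:
                continue  # merged file would not fit in one block
            d = abs(centroid(clusters[i]) - centroid(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best  # i < j, so deleting j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]


# Usage with made-up files and a small block size for readability.
files = [
    SmallFile("a", 40, 1.0),
    SmallFile("b", 40, 1.1),
    SmallFile("c", 40, 5.0),
    SmallFile("d", 30, 5.2),
]
groups = merge_small_files(files, block_size=100)
print([[f.name for f in g] for g in groups])
```

Each resulting group would be written as one merged file, so the NameNode tracks one entry per group rather than one per small file. A real implementation would also need an index that maps each original file name to its offset inside the merged file.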

Citations

IRE Journals:
Htu Ra "Merging Small Files For Cloud Storage Using Agglomerative Hierarchical Clustering" Iconic Research And Engineering Journals Volume 2 Issue 12 2019 Page 180-186

IEEE:
Htu Ra "Merging Small Files For Cloud Storage Using Agglomerative Hierarchical Clustering" Iconic Research And Engineering Journals, 2(12)