The emergence of ever-larger volumes, higher velocities, and an increasingly diverse range of big data has driven the development of more sophisticated techniques for big data processing and archiving. ETL has evolved from extract, transform, load processes that were executed centrally and limited in scalability to distributed architectures built on cloud computing and big data technologies. These modern distributed ETL architectures process tasks by distributing them across nodes or clusters, thereby promoting scalability, performance, and fault tolerance. By exploiting parallelism in ETL processes through distributed systems, large datasets can be handled more easily and quickly, and real-time data processing can be reconciled with data integration and transformation. In this paper, we discuss a distributed ETL architecture suited to large-scale data processing, covering its key components of data extraction, data transformation, and loading, all of which are designed for distributed high-performance environments. Through a case study of the described architecture and an analysis of its performance, we explain how this design reduces processing time and improves system scalability. The findings also show that distributed ETL frameworks play a crucial role in a wide range of big data handling and analysis applications, and demonstrate how effectively they can meet today's growing data management needs.
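To make the idea of distributing extract, transform, and load work across parallel workers concrete, the following is a minimal, self-contained Python sketch, not the paper's implementation: it partitions a synthetic dataset and runs the extract and transform stages in parallel across a process pool, standing in for nodes in a cluster. All function names, partition counts, and the list-based sink are illustrative assumptions.

```python
from concurrent.futures import ProcessPoolExecutor


def extract(partition_id, partition_size=1000):
    """Extract one partition of raw records (synthetic rows; a real source
    would be a database, message queue, or object store)."""
    start = partition_id * partition_size
    return [{"id": i, "value": i * 0.5} for i in range(start, start + partition_size)]


def transform(records):
    """Apply a simple transformation (filter plus a derived column) to a partition."""
    return [
        {**r, "value_squared": r["value"] ** 2}
        for r in records
        if r["value"] >= 0
    ]


def load(records, sink):
    """Load a transformed partition into the sink (a list standing in for a data store)."""
    sink.extend(records)
    return len(records)


def etl_partition(partition_id):
    """Run extract -> transform for one partition on a worker process."""
    return transform(extract(partition_id))


if __name__ == "__main__":
    sink = []
    # Distribute partitions across worker processes; each worker extracts and
    # transforms its partition independently, mirroring how a distributed ETL
    # cluster parallelizes work across nodes.
    with ProcessPoolExecutor(max_workers=4) as pool:
        for transformed in pool.map(etl_partition, range(8)):
            load(transformed, sink)  # loading is serialized here for simplicity
    print(f"Loaded {len(sink)} records from 8 partitions")
```

In a production distributed ETL framework, partitioning, scheduling, and fault tolerance would be handled by the cluster engine rather than a local process pool, but the division of work per partition follows the same pattern.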
Distributed ETL Architecture, Big Data Analytics, Scalable Data Processing, Cloud-Based Data Management, High-Performance Computing, Fault-Tolerant Systems, Parallel Data Integration.