The automatic translation of source code between programming languages is a critical challenge in modern software engineering, particularly for legacy system migration and cross-platform development. This paper proposes a Transformer-based Neural Machine Translation (NMT) framework specifically designed for source-to-source code translation, targeting language pairs including Python↔Java, Python↔C++, and Java↔C++. Unlike traditional rule-based transpilers, our approach leverages pre-trained code models (CodeT5+) fine-tuned on a curated multilingual parallel corpus, augmented with Abstract Syntax Tree (AST) structural embeddings to better capture code semantics. We introduce a novel post-processing semantic validation module using unit-test execution feedback and compiler signals to iteratively refine translations and maximize functional equivalence. Experimental evaluation on standard benchmarks (TransCoder-test, AVATAR, CodeNet) demonstrates state-of-the-art Computational Accuracy (CA@1) scores and significant reductions in compilation errors compared to baseline models. Our work addresses the key open challenge of semantic preservation in neural code translation, contributing both a novel architecture and a new evaluation protocol.
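The validation idea described above (execute unit tests on a candidate translation, collect compiler and test signals, feed them back for another attempt) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the translation target is Python, uses the built-in `compile()` as a stand-in for a compiler signal, and treats `translator` as a hypothetical callable that returns a revised candidate given the previous one and a list of feedback messages. The names `run_checks`, `refine`, and `ca_at_1` are illustrative only. CA@1 is taken here to mean the fraction of problems whose single candidate passes all of its unit tests.

```python
from typing import Callable, List, Sequence, Tuple

# One unit test: (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


def run_checks(source: str, func_name: str, tests: Sequence[TestCase]) -> List[str]:
    """Return feedback messages; an empty list means every check passed.

    NOTE: exec() on untrusted code is unsafe. A real system would run
    candidates in a sandboxed subprocess with time and memory limits.
    """
    try:
        # Assumption: compile() plays the role of the compiler signal.
        code = compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return [f"compile error: {exc.msg} (line {exc.lineno})"]

    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception as exc:
        return [f"load error: {exc!r}"]

    func = namespace.get(func_name)
    if not callable(func):
        return [f"missing function: {func_name}"]

    feedback = []
    for args, expected in tests:
        try:
            actual = func(*args)
        except Exception as exc:
            feedback.append(f"{func_name}{args} raised {exc!r}")
            continue
        if actual != expected:
            feedback.append(
                f"{func_name}{args} returned {actual!r}, expected {expected!r}"
            )
    return feedback


def refine(
    candidate: str,
    func_name: str,
    tests: Sequence[TestCase],
    translator: Callable[[str, List[str]], str],  # hypothetical model wrapper
    max_rounds: int = 3,
) -> str:
    """Ask the translator for a revision until the checks pass or rounds run out."""
    for _ in range(max_rounds):
        feedback = run_checks(candidate, func_name, tests)
        if not feedback:
            break
        candidate = translator(candidate, feedback)
    return candidate


def ca_at_1(
    candidates: Sequence[str],
    func_name: str,
    test_suites: Sequence[Sequence[TestCase]],
) -> float:
    """Fraction of problems whose single candidate passes all of its tests."""
    if not candidates:
        return 0.0
    passed = sum(
        1
        for source, tests in zip(candidates, test_suites)
        if not run_checks(source, func_name, tests)
    )
    return passed / len(candidates)
```

In this sketch a compilation failure short-circuits the test run, so the feedback passed to the translator distinguishes "does not compile" from "compiles but returns the wrong value". That mirrors the two signal types named in the abstract.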
Keywords: Neural Machine Translation, Source Code Translation, Transformer Architecture, CodeT5, Abstract Syntax Tree, Semantic Preservation, Legacy Migration
IRE Journals:
Pallavi Mahale "Neural Machine Translation of Source Code Across Programming Languages Using Transformer Architecture" Iconic Research And Engineering Journals Volume 9 Issue 12 2026 Page 2039-2047 https://doi.org/10.64388/IREV9I12-1719036
IEEE:
Pallavi Mahale, "Neural Machine Translation of Source Code Across Programming Languages Using Transformer Architecture," Iconic Research And Engineering Journals, 9(12). https://doi.org/10.64388/IREV9I12-1719036