This paper presents the design, implementation, and evaluation of an AI-Driven Synesthetic Music Visualizer, a real-time system that computationally emulates the neurological phenomenon of synesthesia by translating auditory signals into semantically congruent, dynamic visual art. Audio tracks in MP3 or WAV format are ingested and decomposed into perceptual acoustic features, including Root Mean Square (RMS) energy, spectral centroid, chroma vector, tempo, and spectral rolloff, through Short-Time Fourier Transform (STFT)-based signal processing using the Librosa library. Six machine learning models are trained and benchmarked on an annotated audio-visual mapping corpus: Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM), Random Forest, Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Gradient Boosting. The Gradient Boosting model achieves the highest classification performance, with an F1-score of 88.8% and an average inference latency of 29 ms, well within the perceptual synchronisation budget. Predicted visual parameters (colour palette, shape morphology, animation velocity) are forwarded to a GPU-accelerated OpenGL rendering engine that sustains 62 frames per second on commodity hardware. The complete pipeline is deployed as a browser-accessible Gradio application. Results demonstrate that intelligent cross-modal synthesis is achievable in genuine real time, opening avenues for generative art, live performance, and assistive technology for hearing-impaired users.
Synesthesia, Music Visualisation, Deep Learning, LSTM, GAN, Audio Feature Extraction, Real-Time Processing, Generative AI, Cross-Modal Synthesis, Gradient Boosting, Gradio
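As an illustration of the feature-extraction stage described in the abstract, the sketch below computes three of the named features (RMS energy, spectral centroid, and spectral rolloff) for a single audio frame. It is not the authors' implementation: the paper uses Librosa, whereas this version swaps in a naive discrete Fourier transform from the Python standard library so that it runs without dependencies. The function names `frame_features` and `sine_frame`, the Hann window, and the 0.85 rolloff fraction are assumptions made for this example.

```python
import cmath
import math


def frame_features(frame, sample_rate, rolloff_fraction=0.85):
    """Return RMS energy, spectral centroid (Hz) and spectral rolloff (Hz)
    for one frame of mono samples.

    Hypothetical helper for illustration only; the paper's pipeline uses
    Librosa rather than this naive DFT.
    """
    n = len(frame)

    # RMS energy: square root of the mean squared amplitude.
    rms = math.sqrt(sum(x * x for x in frame) / n)

    # Apply a periodic Hann window to reduce spectral leakage.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]

    # Naive O(n^2) DFT magnitudes for the non-negative frequency bins.
    bins = n // 2 + 1
    mags = []
    for k in range(bins):
        acc = sum(
            windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        mags.append(abs(acc))
    freqs = [k * sample_rate / n for k in range(bins)]

    total = sum(mags)
    if total == 0.0:
        # Silent frame: the spectral features are undefined, report zeros.
        return {"rms": rms, "centroid_hz": 0.0, "rolloff_hz": 0.0}

    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total

    # Spectral rolloff: lowest frequency at which the cumulative magnitude
    # reaches the chosen fraction of the total.
    threshold = rolloff_fraction * total
    running = 0.0
    rolloff = freqs[-1]
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            rolloff = f
            break

    return {"rms": rms, "centroid_hz": centroid, "rolloff_hz": rolloff}


def sine_frame(freq_hz, sample_rate, n):
    """Generate n samples of a unit-amplitude sine wave (test signal)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


if __name__ == "__main__":
    features = frame_features(sine_frame(1000.0, 8000, 256), 8000)
    for name, value in features.items():
        print(f"{name}: {value:.3f}")
```

For a pure 1 kHz tone the centroid lands very close to 1000 Hz, which gives a quick sanity check. A real-time pipeline would apply the same per-frame computation to overlapping STFT frames using an FFT, and would add the chroma and tempo features that this sketch omits.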
IRE Journals:
Mohammed Yusoof S, Harini C N, Harini S, Saranya R, Dr. Lakshmi Devi "AI-Driven Synesthetic Music Visualizer: Real-Time Cross-Modal Audio-to-Visual Translation Using Machine Learning" Iconic Research And Engineering Journals Volume 9 Issue 9 2026 Page 3477-3483 https://doi.org/10.64388/IREV9I9-1715778
IEEE:
Mohammed Yusoof S, Harini C N, Harini S, Saranya R, Dr. Lakshmi Devi
"AI-Driven Synesthetic Music Visualizer: Real-Time Cross-Modal Audio-to-Visual Translation Using Machine Learning" Iconic Research And Engineering Journals, 9(9) https://doi.org/10.64388/IREV9I9-1715778