Bedtime stories help children develop imagination and communication skills and strengthen their emotional connection with parents. However, creating a personalized, engaging story every night can be challenging for parents. Existing AI-based storytelling systems rely mainly on text input, which tends to produce generic narratives that lack contextual relevance and emotional adaptation. This paper presents a multimodal framework that transforms an input image into a personalized bedtime audio story using user profile attributes such as age, preferences, and mood. The system combines image understanding, generative language models, and text-to-speech techniques to produce context-aware, emotionally adaptive narratives with calming themes. By integrating visual context with personalization, the proposed approach aims to make stories more engaging and meaningful. This work highlights the potential of multimodal AI to enhance bedtime routines and support child well-being.
Multimodal AI, Personalized Storytelling, Image-to-Audio, Text-to-Speech, User Profiling
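The pipeline described in the abstract (image understanding, then profile-conditioned story generation, then text-to-speech) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `UserProfile`, `build_story_prompt`, and `image_to_audio_story`, the prompt wording, and the three callables standing in for the captioning model, the language model, and the speech synthesizer are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UserProfile:
    """Hypothetical profile; the abstract names age, preferences, and mood."""
    age: int
    preferences: List[str] = field(default_factory=list)
    mood: str = "calm"


def build_story_prompt(caption: str, profile: UserProfile) -> str:
    """Combine the image description with profile attributes into one prompt.

    The wording of the prompt is an assumption, not taken from the paper.
    """
    likes = ", ".join(profile.preferences) or "no stated preferences"
    return (
        f"Write a calming bedtime story for a {profile.age}-year-old child.\n"
        f"Scene from the image: {caption}\n"
        f"Child's interests: {likes}\n"
        f"Current mood: {profile.mood}\n"
        "Keep the tone gentle and soothing."
    )


def image_to_audio_story(
    image_bytes: bytes,
    profile: UserProfile,
    describe_image: Callable[[bytes], str],      # stand-in for an image-understanding model
    generate_text: Callable[[str], str],         # stand-in for a generative language model
    synthesize_speech: Callable[[str], bytes],   # stand-in for a text-to-speech engine
) -> bytes:
    """Run the three stages in order and return the synthesized audio."""
    caption = describe_image(image_bytes)
    prompt = build_story_prompt(caption, profile)
    story = generate_text(prompt)
    return synthesize_speech(story)
```

Passing the three models in as callables keeps the sketch independent of any particular library; each stage can be replaced with a real model without changing the flow.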
IRE Journals:
Krutika Sushil Nikumbh, Dr. Prakash Kene, "Personalized Image-to-Audio Bedtime Story Generation Using Multimodal AI with User Profiling," Iconic Research And Engineering Journals, Volume 9, Issue 11, 2026, Pages 4443-4449. https://doi.org/10.64388/IREV9I11-1718331
IEEE:
Krutika Sushil Nikumbh, Dr. Prakash Kene, "Personalized Image-to-Audio Bedtime Story Generation Using Multimodal AI with User Profiling," Iconic Research And Engineering Journals, 9(11). https://doi.org/10.64388/IREV9I11-1718331