MelodyCraft: A Prompt-Based Modular AI Framework for Synchronized Lyrics and Instrumental Music Generation
Main Article Content
Abstract
Current artificial intelligence (AI) music generation systems typically lack synchronization between lyrics and melody, emotional correlation, and objective assessment criteria. To address these weaknesses, this study proposes MelodyCraft, a prompt-based modular AI architecture that combines MusicGen-small for instrumental music generation with a Mixtral transformer for genre- and emotion-specific lyric generation. QLoRA-based fine-tuning was performed on 28,000 prompt-lyric pairs, reaching a validation loss of 1.85 and a BLEU score of 0.65, which indicates high lyrical coherence and stylistic fidelity. A spectral analysis of 10 human-composed and 10 AI-generated pieces across pop, rock, and jazz genres found no statistically significant differences (p > 0.05) in spectral centroid, bandwidth, or roll-off, indicating near-human acoustic realism. In addition, a human listening experiment with 16 subjects yielded average Mean Opinion Scores (MOS) of 3.9-4.0 for melody quality, lyrical coherence, and emotional connection, establishing perceptual parity with human compositions. MelodyCraft is a full-stack multimodal music generation system that produces semantically, rhythmically, and emotionally consistent music and can serve as a guide for future AI-assisted creative systems.
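The spectral comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the centroid, bandwidth, and roll-off formulas are the standard magnitude-spectrum definitions, the two-sample t-test stands in for whatever significance test the authors used, and the audio clips are synthetic placeholders rather than the study's 10 human and 10 AI pieces.

```python
import numpy as np
from scipy import stats

def spectral_features(signal, sr=22050, rolloff_pct=0.85):
    """Return (centroid, bandwidth, roll-off) in Hz for one mono signal."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    weights = mag / (mag.sum() + 1e-12)          # normalized magnitude spectrum
    centroid = float((freqs * weights).sum())    # magnitude-weighted mean frequency
    bandwidth = float(np.sqrt(((freqs - centroid) ** 2 * weights).sum()))
    cumulative = np.cumsum(weights)              # roll-off: frequency below which
    rolloff = float(freqs[np.searchsorted(cumulative, rolloff_pct)])  # 85% of energy lies
    return centroid, bandwidth, rolloff

# Two illustrative groups of 10 one-second clips (noisy sine tones),
# standing in for the human-composed and AI-generated sets.
rng = np.random.default_rng(0)
sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
human = [np.sin(2 * np.pi * (440 + rng.normal(0, 5)) * t)
         + 0.01 * rng.normal(size=t.size) for _ in range(10)]
ai = [np.sin(2 * np.pi * (440 + rng.normal(0, 5)) * t)
      + 0.01 * rng.normal(size=t.size) for _ in range(10)]

cent_h = [spectral_features(x, sr)[0] for x in human]
cent_a = [spectral_features(x, sr)[0] for x in ai]
_, p = stats.ttest_ind(cent_h, cent_a)
print(f"spectral centroid p-value: {p:.3f}")
# p > 0.05 would indicate no significant centroid difference between groups,
# the criterion the paper uses to claim near-human acoustic realism.
```

The same comparison would be repeated for bandwidth and roll-off; in practice a library such as librosa provides equivalent frame-wise feature extractors.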
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.