Baseline Performance of Pre-trained Models on Movie Genre Classification from Spectrograms
Abstract
This study investigates the use of deep learning for classifying movie genres from audio spectrograms. We construct a dataset of movie trailers, transform their audio into spectrograms, and label each spectrogram by genre. We then use MATLAB's pre-trained convolutional neural networks (CNNs) for classification, comparing the performance of nine architectures: MobileNet-v2, ResNet-18, DenseNet-201, Places365-GoogLeNet, VGG-16, VGG-19, Inception-ResNet-v2, Inception-v3, and NASNet-Mobile. We evaluate all models on their ability to classify movie trailers into five genres: action, romance, drama, comedy, and thriller. Our results, based on accuracy and F1-score across genres, indicate that VGG-16 achieves the highest overall performance, with an accuracy of 86.27%, an F1-score of 86.69%, a recall of 86.87%, and a precision of 87.28%. This research demonstrates the potential of pre-trained CNNs, particularly VGG-16, for efficient and effective audio-based genre classification of movie trailers.
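To make the four reported metrics concrete, the following Python sketch shows one common way to derive accuracy, precision, recall, and F1-score from a five-genre confusion matrix. It is illustrative only: the abstract does not state how the per-genre scores were averaged, so macro-averaging is an assumption here, the function name `macro_metrics` is hypothetical, and the confusion matrix contains made-up counts rather than the study's data.

```python
# Illustrative sketch: macro-averaged metrics from a confusion matrix.
# Assumption: per-genre scores are averaged with equal weight (macro-averaging);
# the abstract does not specify the averaging scheme used in the study.

GENRES = ["action", "romance", "drama", "comedy", "thriller"]


def macro_metrics(confusion):
    """Return accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positives = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))  # column sum
        actually_k = sum(confusion[k])                           # row sum

        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


if __name__ == "__main__":
    # Made-up counts for demonstration; NOT results from the study.
    toy_confusion = [
        [18, 0, 1, 0, 1],
        [0, 17, 2, 1, 0],
        [1, 2, 16, 1, 0],
        [0, 1, 1, 18, 0],
        [2, 0, 1, 0, 17],
    ]
    for name, value in macro_metrics(toy_confusion).items():
        print(f"{name}: {value:.4f}")
```

Under macro-averaging, each genre contributes equally to the reported score regardless of how many trailers it contains, which is a typical choice when classes are roughly balanced.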
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.