A Language-Adaptive Ensemble Clustering Framework for Emotion Detection in Multilingual Social Media Text


Wafa Saadi
Fatima Zohra Laallam
Messaoud Mezati

Abstract

Social media platforms generate vast streams of emotionally rich text, offering valuable opportunities for applications such as mental health assessment and the analysis of collective public sentiment. However, detecting emotions in noisy, multilingual content remains challenging, particularly for under-resourced varieties such as dialects. Supervised learning techniques depend heavily on manually annotated corpora, whose creation requires substantial human effort and domain expertise. Unsupervised methods avoid the need for annotation but often lack robustness when confronted with the variability and complexity of natural language across diverse linguistic and cultural contexts. We present an ensemble clustering framework that automatically generates emotion labels from Twitter data without human intervention. Our approach incorporates three emoji-handling strategies in the preprocessing step, enabling diverse semantic representations of emojis. We use BERT embeddings combined with PCA for dimensionality reduction within the same experimental framework. An ensemble clustering strategy integrating K-Means, Agglomerative clustering, and Gaussian Mixture Models (GMM) is evaluated under multiple ensemble configurations. Experiments on 10,017 English tweets and 4,134 Arabic tweets show that the proposed method achieves a silhouette score of 0.808 on English data with both the K-Means with Agglomerative and the K-Means with GMM ensemble configurations. For Arabic data, silhouette scores of 0.728 and 0.718 are obtained using English and Arabic keywords, respectively. Emoji semantic integration enhances ensemble clustering performance, suggesting its importance for contextual disambiguation. The proposed framework provides a scalable solution for emotion detection in low-resource languages, enabling language-aware applications for linguistically diverse, multilingual populations.
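To make the two central ideas of the abstract concrete, the following is a minimal sketch of how several base partitions can be merged into one consensus labelling and then scored with the silhouette coefficient. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say how the ensemble combines its members, so the co-association (evidence accumulation) rule used here is an assumption. The function names `co_association`, `consensus_labels`, and `silhouette_score`, the 0.5 threshold, the 2-D toy points, and the hard-coded label vectors standing in for K-Means, Agglomerative, and GMM outputs are all hypothetical.

```python
from math import dist


def co_association(partitions):
    """Fraction of base partitions that put each pair of points together."""
    n = len(partitions[0])
    return [
        [sum(p[i] == p[j] for p in partitions) / len(partitions) for j in range(n)]
        for i in range(n)
    ]


def consensus_labels(partitions, threshold=0.5):
    """Assumed consensus rule: link points that are co-clustered in more than
    `threshold` of the partitions, then label the connected components."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def silhouette_score(points, labels):
    """Mean of s = (b - a) / max(a, b), where a is the mean distance to the
    point's own cluster and b the mean distance to the nearest other cluster."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy stand-ins for PCA-reduced embeddings (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Toy stand-ins for the three base clusterers (hypothetical outputs).
kmeans_like = [0, 0, 0, 1, 1, 1]
agglo_like = [1, 1, 1, 0, 0, 0]   # same grouping, different label names
gmm_like = [0, 0, 1, 1, 1, 1]     # disagrees on one point

consensus = consensus_labels([kmeans_like, agglo_like, gmm_like])
score = silhouette_score(points, consensus)
print(consensus, round(score, 3))
```

Working on the co-association matrix rather than on raw labels avoids having to align cluster identifiers across base clusterers, which is why the relabelled second partition causes no trouble. A two-member configuration such as K-Means with Agglomerative would pass only those two label vectors.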

Article Details

How to Cite
[1] W. Saadi, F. Z. Laallam, and M. Mezati, “A Language-Adaptive Ensemble Clustering Framework for Emotion Detection in Multilingual Social Media Text,” ECTI-CIT Transactions, vol. 20, no. 1, pp. 26–39, Dec. 2025.
Section
Research Article
