A Comparative Analysis of Machine Learning Models for Domain Adaptation in Multiclass Sentiment Classification
Abstract
This study presents a comparative evaluation of machine learning models for domain adaptation in multiclass sentiment classification. Sentiment analysis aims to categorize opinions as positive, neutral, or negative, but adapting models across domains remains a significant challenge because vocabulary, writing style, and sentiment expression differ from one domain to another. Models trained on a specific domain often fail to generalize effectively to others. To address this problem, we evaluate six models: logistic regression, support vector machine (SVM) with a linear kernel, random forest, convolutional neural network (CNN), long short-term memory (LSTM), and BERT. Each model is tested on sentiment data from the books, beauty & personal care, and automotive categories. The evaluation uses Amazon review data and measures performance via accuracy, F1 score, and Area Under the ROC Curve (AUC). Results indicate that BERT consistently outperforms all other models due to its attention-based transformer architecture, which captures nuanced contextual information across diverse domains. CNN and LSTM models also perform well, particularly in domain-specific settings, with CNN excelling at extracting local features and LSTM at modeling sequential relationships. Traditional models, such as logistic regression and SVM, show limited generalizability, while random forest demonstrates stable yet moderate performance. These findings highlight the strengths and trade-offs of each approach for effective cross-domain sentiment classification.
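The three metrics named in the abstract can be illustrated with a small sketch for the three-class case. The abstract does not say how F1 and AUC are averaged across classes, so the macro averaging and the one-vs-rest AUC used here are assumptions. The function names and the six-review toy sample are hypothetical and are not data from the study.

```python
# Illustrative sketch only: macro averaging and one-vs-rest AUC are assumptions,
# and the toy predictions below are invented, not results from the study.
LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def binary_auc(is_positive, scores):
    """AUC as the probability that a positive item outranks a negative one."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative item")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, y_prob, labels=LABELS):
    """Mean one-vs-rest AUC; y_prob rows hold one score per label, in order."""
    aucs = []
    for k, label in enumerate(labels):
        flags = [t == label for t in y_true]
        aucs.append(binary_auc(flags, [row[k] for row in y_prob]))
    return sum(aucs) / len(aucs)


# Hypothetical target-domain sample: true labels and per-class model scores.
y_true = ["negative", "negative", "neutral", "neutral", "positive", "positive"]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
# Predicted label = class with the highest score.
y_pred = [LABELS[row.index(max(row))] for row in y_prob]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"macro AUC = {macro_ovr_auc(y_true, y_prob):.3f}")
```

In a cross-domain setting, the same three functions would be applied to predictions on a target category (for example, automotive) from a model trained on a different source category (for example, books). Comparing those scores with the in-domain scores gives a rough measure of how much performance is lost when the domain changes.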
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.