Enhancing the Performance of Sentiment Analysis Models Using GridSearchCV: A Case Study on Electric Vehicles in Thailand
Abstract
This study investigates how hyperparameter tuning with GridSearchCV improves the performance of sentiment analysis models, focusing on electric vehicle reviews in Thailand. To address the challenges inherent in Thai language processing, PyThaiNLP was employed for word segmentation and text preprocessing. Four machine learning models were implemented for sentiment classification: Support Vector Machines (SVM), Multinomial Naive Bayes, Logistic Regression, and Stochastic Gradient Descent. The hyperparameters of each model were systematically optimized to find the configuration that maximizes accuracy, precision, recall, and F1-score. Experimental results showed improvements across all models after optimization. The SVM model, identified as the best-performing classifier, improved in accuracy from 75.00% to 76.69% and in F1-score from 74.82% to 76.69%. The optimal SVM configuration used a radial basis function (RBF) kernel with a regularization parameter (C) of 10 and a gamma value of 0.1. These findings underscore the importance of hyperparameter optimization for model effectiveness and contribute to advancing sentiment analysis in linguistically complex settings such as Thai.
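The tuning procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the TF-IDF features, the parameter grid, the `f1_weighted` scoring choice, the helper names `tokenize` and `build_search`, and the toy reviews are all assumptions made for illustration. Only the use of PyThaiNLP for segmentation, GridSearchCV for tuning, and the reported optimum (RBF kernel, C = 10, gamma = 0.1) come from the abstract. The sketch falls back to whitespace splitting if PyThaiNLP is not installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

try:
    # PyThaiNLP's word segmenter, as named in the abstract.
    from pythainlp.tokenize import word_tokenize
except ImportError:
    word_tokenize = None  # fallback below keeps the sketch runnable


def tokenize(text):
    """Segment text into tokens (hypothetical helper, not from the paper)."""
    if word_tokenize is not None:
        # Drop whitespace-only tokens produced by the segmenter.
        return [tok for tok in word_tokenize(text) if tok.strip()]
    return text.split()


def build_search(cv=3):
    """Build a grid search over an SVM text classifier (assumed setup)."""
    pipeline = Pipeline([
        # TF-IDF is an assumption; the abstract does not name the features.
        ("tfidf", TfidfVectorizer(tokenizer=tokenize, token_pattern=None,
                                  lowercase=False)),
        ("svm", SVC()),
    ])
    # Assumed grid; it merely contains the optimum reported in the abstract
    # (kernel="rbf", C=10, gamma=0.1). gamma is ignored by the linear kernel.
    param_grid = {
        "svm__kernel": ["linear", "rbf"],
        "svm__C": [0.1, 1, 10],
        "svm__gamma": [0.01, 0.1, 1],
    }
    return GridSearchCV(pipeline, param_grid, scoring="f1_weighted", cv=cv)


# Toy placeholder reviews (invented for illustration, not the study's data).
texts = [
    "รถ ดี มาก", "ชอบ รถ มาก", "ประหยัด ดี", "เร็ว ดี มาก",
    "รถ แย่ มาก", "ไม่ชอบ รถ", "แพง มาก", "ช้า แย่ มาก",
]
labels = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"]

search = build_search(cv=2)
search.fit(texts, labels)
print(search.best_params_)
print(round(search.best_score_, 4))
```

On a real corpus, the test set would be held out before the search, and accuracy, precision, recall, and F1-score would be reported on it using `search.best_estimator_`.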
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
All authors must complete a copyright transfer to the Journal of Applied Informatics and Technology prior to publication. For details, see https://ph01.tci-thaijo.org/index.php/jait/copyrightlicense