Sentiment Classification Based on Term Weighting with Class-mutual Information


Uraiwan Buatoom
Kanyarat Cheawchan
Vajiraporn Sriput
Sombut Foithong

Abstract

Online platforms and information technology are developing quickly, which has boosted the popularity of e-commerce. Customer sentiment expressed in posted content is now vital to product makers who want to improve product quality and exceed customers' expectations. Sentiment analysis is a natural language processing task that determines users' sentiment and attitude towards a product, whether positive or negative. Most sentiment and text classification research uses term weighting with inverse document frequency (idf). However, assigning term weights with idf alone may not be effective enough for sentiment classification, because this weight does not take class information into account. This paper presents a supervised term weighting scheme that uses class mutual information together with term frequency and inverse document frequency. Experimental results show that the proposed method is more effective than term distribution weighting and than weighting based only on inverse document frequency, as measured by accuracy, precision, recall, and F1-measure.
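The abstract does not state the exact weighting formula, so the following is only a minimal sketch of the general idea, under stated assumptions: the mutual information is taken between a term's presence in a document and the class label, and the final weight is assumed to be the plain product tf × idf × MI. The function names `class_mutual_information` and `weight_documents`, the toy reviews, and the product form are all illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def class_mutual_information(docs, labels):
    """Mutual information (in bits) between term presence and class label.

    Assumption: presence/absence per document is the term variable;
    the paper may define its class mutual information differently.
    """
    n = len(docs)
    class_counts = Counter(labels)
    vocab = {t for d in docs for t in d}
    # present[t][c] = number of documents of class c that contain term t
    present = {t: Counter() for t in vocab}
    for d, y in zip(docs, labels):
        for t in set(d):
            present[t][y] += 1

    mi = {}
    for t in vocab:
        df = sum(present[t].values())  # document frequency of t
        total = 0.0
        for c, nc in class_counts.items():
            # Two cells per class: term present, term absent.
            cells = ((present[t][c], df), (nc - present[t][c], n - df))
            for n_tc, n_t in cells:
                if n_tc > 0:  # empty cells contribute nothing
                    total += (n_tc / n) * math.log2(n_tc * n / (n_t * nc))
        mi[t] = total
    return mi


def weight_documents(docs, labels):
    """Weight each term as tf * idf * MI (assumed combination)."""
    n = len(docs)
    mi = class_mutual_information(docs, labels)
    df = Counter(t for d in docs for t in set(d))
    weighted = []
    for d in docs:
        tf = Counter(d)
        weighted.append({t: tf[t] * math.log(n / df[t]) * mi[t] for t in tf})
    return weighted


# Toy, hand-made reviews (hypothetical data, for illustration only).
docs = [
    ["good", "phone"],
    ["great", "phone"],
    ["bad", "phone"],
    ["poor", "phone"],
]
labels = ["pos", "pos", "neg", "neg"]

mi = class_mutual_information(docs, labels)
weighted = weight_documents(docs, labels)
```

In this toy set, "phone" occurs in every review of both classes, so its presence says nothing about the class and its weight vanishes, while a term such as "good" that occurs in only one class keeps a positive weight. This is the kind of class information that idf alone does not capture.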

Article Details

Section: Research Article

Author Biography

Sombut Foithong, Applied Artificial Intelligent and Smart Technology Program, Faculty of Science and Arts, Burapha University, Chanthaburi Campus