Feature Selection Method for Improving Customer Reviews Classification

ธีรยุทธ คูณสุข
จารี ทองคำ

Abstract

This research proposes an improved search for feature selection techniques to increase the efficiency of classifying customer feedback about restaurants. The data consist of 4,487 messages collected from the website wongnai.com. The research team compared three techniques for selecting text features: Chi-Square, Information Gain, and Information Gain Ratio. To measure the effectiveness of each feature selection technique, the selected features were classified with Naive Bayes, Support Vector Machine, K-Nearest Neighbor, and C4.5. 10-fold cross-validation was used to divide the data into training and test sets, and performance was measured by accuracy, precision, and recall. The experiments showed that Information Gain feature selection combined with Naive Bayes gave the best results for classifying the comments, with an accuracy of 89.08%, a precision of 89.12%, and a recall of 89.10%.
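
To make the Information Gain step concrete, the following is a minimal sketch of how terms might be ranked by information gain before classification. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `select_top_terms`) and the four toy reviews are hypothetical, and each review is assumed to be already tokenized into a set of terms, with a term treated as a binary present/absent feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_top_terms(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(
        vocab, key=lambda t: information_gain(docs, labels, t), reverse=True
    )
    return ranked[:k]


# Hypothetical toy reviews (already tokenized); not taken from the wongnai.com data.
docs = [
    {"tasty", "cheap"},
    {"tasty", "friendly"},
    {"slow", "cheap"},
    {"slow", "dirty"},
]
labels = ["pos", "pos", "neg", "neg"]

top_terms = select_top_terms(docs, labels, 2)
print(top_terms)
```

In this toy data, "tasty" and "slow" separate the two classes perfectly, while "cheap" appears equally in both and carries no information. In a full pipeline, only the selected terms would be passed to a classifier such as Naive Bayes and evaluated with cross-validation.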

Article Details

How to Cite
[1] คูณสุข ธ. and ทองคำ จ., “Feature Selection Method for Improving Customer Reviews Classification”, RMUTI Journal, vol. 13, no. 1, pp. 129–143, Oct. 2019.
Section
Research article
