A hybrid model using MaLSTM based on recurrent neural networks with support vector machines for sentiment analysis


Srinidhi H
Siddesh GM
Srinivasa KG

Abstract

Sentiment analysis is an active research area in data science. It helps gather insights into the behaviors of users and the products associated with them. Most sentiment analysis applications focus on tweets from Twitter collected via hashtags. However, when reviews are analyzed on their own, the sentiments behind them emerge more clearly. The primary challenge in sentiment analysis is identifying the keywords that determine the polarity of a sentence. In this paper, a hybrid model is proposed that combines a Manhattan LSTM (MaLSTM), a recurrent neural network (RNN) variant built on long short-term memory (LSTM), with support vector machines (SVM) for sentiment classification. The proposed method learns hidden representations with the LSTM and then determines the sentiments using an SVM. Classification of sentiments is carried out on the IMDB movie review dataset using an SVM trained on the learned LSTM representations. The results of the proposed model outperform those of existing hashtag-based models.
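The two-stage pipeline described in the abstract can be sketched in a minimal form. This is an illustrative sketch only, not the authors' implementation: the 50-dimensional vectors below are synthetic stand-ins for the fixed-length hidden representations that the MaLSTM encoder would produce for each review, and the cluster means, dimensions, and kernel choice are assumptions. The second stage, an SVM classifier over those representations, follows the paper's description.

```python
import numpy as np
from sklearn.svm import SVC

# Stand-in for stage 1: in the paper, these vectors would be the final
# hidden states of the MaLSTM encoder for each review. Here we simulate
# two clusters standing in for positive and negative reviews.
rng = np.random.default_rng(0)
pos = rng.normal(loc=+1.0, scale=0.5, size=(100, 50))
neg = rng.normal(loc=-1.0, scale=0.5, size=(100, 50))
X = np.vstack([pos, neg])
y = np.array([1] * 100 + [0] * 100)

# Stage 2: an SVM classifies sentiment from the learned representations.
clf = SVC(kernel="rbf")
clf.fit(X, y)
print(clf.score(X, y))
```

In the actual model the stand-in features would be replaced by the LSTM's hidden states extracted after training on the IMDB reviews, with the SVM fitted on those vectors and their sentiment labels.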


Article Details

How to Cite
H, S., GM, S., & KG, S. (2020). A hybrid model using MaLSTM based on recurrent neural networks with support vector machines for sentiment analysis. Engineering and Applied Science Research, 47(3), 232-240. Retrieved from https://ph01.tci-thaijo.org/index.php/easr/article/view/223697
Section
ORIGINAL RESEARCH
