Applied big data technique and deep learning for massive open online courses (MOOCs) recommendation system


Siriporn Sakboonyarat
Panjai Tantatsanawong


Traditional recommendation techniques suffer from scalability problems that result in poor-quality recommendations, so they cannot be used effectively on big data. With the immense and growing amount of online learning resources, it has become harder for users to find and select their preferred content, and course recommendation systems face the same information overload problem. Most recommendation systems are built on top of a specific learning management system and can only be used with that system. Furthermore, the storage and processing of these systems cannot be updated, which makes them unsuitable for real-world problems, where data is continuously changing and emerging. To address these problems, this study proposes a novel online recommender system, MCR-C-FGM. It runs on clusters, using the Apache platform to distribute the processing of large datasets, together with a clustering model built from a Deep Neural Network and Long Short-Term Memory. The network is trained with the fit-generator method. Tests on real MOOCs data from Harvard University and MIT, published on edX, show a precision of 75%, an accuracy of 76%, and a recall of 78% in the evaluation process. Time efficiency during training improves by 35% compared with the non-clustering model. Moreover, MCR-C-FGM can be scaled out, which allows it to support big data efficiently.
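The abstract does not say how the fit-generator method is implemented. The name suggests generator-based training as offered by Keras (`Model.fit_generator`, deprecated since TensorFlow 2.1 in favor of passing a generator to `Model.fit`), but that reading is an assumption. The minimal sketch below shows only the general idea in plain Python: training data is streamed in mini-batches so that the full dataset never has to be held as one array. The names `Row`, `batch_generator`, and `steps_per_epoch`, and the toy data, are hypothetical and do not come from the article.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical record shape: (feature vector, class label).
Row = Tuple[Sequence[float], int]
Batch = Tuple[List[Sequence[float]], List[int]]


def batch_generator(rows: Sequence[Row], batch_size: int) -> Iterator[Batch]:
    """Yield (features, labels) mini-batches, looping over the data forever."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not rows:
        raise ValueError("rows must not be empty")
    while True:  # generator-based training APIs usually expect an endless stream
        for start in range(0, len(rows), batch_size):
            chunk = rows[start:start + batch_size]
            features = [x for x, _ in chunk]
            labels = [y for _, y in chunk]
            yield features, labels


def steps_per_epoch(n_rows: int, batch_size: int) -> int:
    """Number of batches needed to see every row once (ceiling division)."""
    return -(-n_rows // batch_size)


# Toy data (an assumption for illustration): five rows with one feature each.
rows: List[Row] = [([float(i)], i % 2) for i in range(5)]
gen = batch_generator(rows, batch_size=2)
steps = steps_per_epoch(len(rows), batch_size=2)
epoch = [next(gen) for _ in range(steps)]
```

With a Keras model, a generator of this kind (yielding NumPy arrays rather than lists) would typically be passed to the training call along with the step count. Whether MCR-C-FGM does this, and how the batches are distributed across the cluster, is not stated in the abstract.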

Article Details

How to Cite
S. Sakboonyarat and P. Tantatsanawong, “Applied big data technique and deep learning for massive open online courses (MOOCs) recommendation system”, ECTI-CIT Transactions, vol. 16, no. 4, pp. 436–447, Oct. 2022.
Research Article

