Diagnostic Prediction Models for Cardiovascular Disease Risk using Data Mining Techniques

Main Article Content

Nongyao Nai-arun
Rungruttikarn Moungmai


Cardiovascular disease is the leading national health problem in Thailand and causes a large number of deaths, and the number of patients continues to grow. A proactive measure for disease prevention and control is to identify risk groups, so that people who are at risk can be diagnosed and can manage themselves by reducing risk factors and adjusting their behavior. For this reason, diagnostic prediction models for cardiovascular disease risk were developed. Patient data were collected from 126 health promoting hospitals and 12 hospitals in Saraburi Province. Six models were then built: logistic regression, random forest, back-propagation neural network, decision tree, naïve Bayes, and K-nearest neighbors. Each model was evaluated with 10-fold cross-validation. The results showed that the logistic regression model achieved the highest accuracy, 99.940%, followed by the back-propagation neural network model at 98.506%. The best model should be developed into a web application to screen for new patients or risk groups, which would help to prevent and control the disease quickly and to reduce mortality.
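
The evaluation procedure named in the abstract, 10-fold cross-validation of a classifier, can be sketched roughly as follows. This is a minimal illustration in plain Python and not the authors' implementation: the data are synthetic, and the helper names (k_fold_indices, knn_predict, cross_validate, make_synthetic_data) as well as the choice of 3 neighbors are assumptions made only for this example. K-nearest neighbors is used because it is one of the six models listed and needs no external library.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Predict the majority label among the k training points closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def cross_validate(X, y, k_folds=10, k_neighbors=3, seed=0):
    """Return the mean accuracy over k_folds held-out folds."""
    folds = k_fold_indices(len(X), k_folds, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def make_synthetic_data(n=200, seed=0):
    """Hypothetical two-class data; NOT the Saraburi patient records."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 5.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic_data()
mean_accuracy = cross_validate(X, y)
print(f"10-fold mean accuracy: {mean_accuracy:.3f}")
```

Each sample is held out exactly once, so the reported figure is the average accuracy on data the model did not see during training. Comparing several models would repeat cross_validate with a different prediction function for each.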

Article Details

How to Cite
N. Nai-arun and R. Moungmai, “Diagnostic Prediction Models for Cardiovascular Disease Risk using Data Mining Techniques”, ECTI-CIT, vol. 14, no. 2, pp. 113-121, Jun. 2020.
Research Article

