Comparison of Missing Data Imputation Methods in Dependent Variable with Missing at Random for Multiple Linear Regression

สุปรียา สระโสม; ธิดาเดียว มยุรีสวรรค์

Published: Dec 30, 2019

Keywords:

Missing data imputation; Multiple linear regression; Average mean square error

สุปรียา สระโสม

Department of Statistics, Faculty of Science, Khon Kaen University, Mueang District, Khon Kaen

ธิดาเดียว มยุรีสวรรค์

Department of Statistics, Faculty of Science, Khon Kaen University, Mueang District, Khon Kaen

Abstract

This research develops three methods for imputing missing values of the dependent variable in multiple linear regression when the dependent variable is missing at random: the Mean Regression Imputation method (MRI), the Expectation Maximization with Multiple Imputation method (EMMI), and the Nearest Average Regression Imputation method (NARI). The efficiency of the developed methods is compared with that of six other methods: the Regression Imputation method (RI), the Stochastic Regression Imputation method (SRI), the K Nearest Neighbour Imputation method (KNN), the Expectation Maximization Algorithm method (EM), the Multiple Imputation method (MI), and the Proportioned Residual Draw Imputation method (PRD). A simulation study was carried out in R with the standard deviation of the error ($\sigma$) set to 5, 10, and 15, sample sizes (n) of 30, 50, 100, and 200, and missing percentages of 5, 10, 15, and 20. The performance criterion is the Average Mean Square Error (AMSE). The results show that the EMMI method performs best at all sample sizes when $\sigma$ = 5 and the missing percentage is 5. The MRI method outperforms the others at all sample sizes when $\sigma$ = 10 and the missing percentage is 5, and it remains the best method when $\sigma$ = 15 for all missing percentages and almost all sample sizes. For the real data with n = 50, the MRI method is the most effective at all missing percentages.
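As a rough illustration of the kind of simulation the abstract describes, the sketch below generates data from a multiple linear regression, removes part of the dependent variable under a missing-at-random mechanism, and imputes it with the RI and SRI baselines. The study itself was run in R; Python with NumPy is used here only for illustration. The two-predictor design, the coefficients, the logistic missingness weights, the function names, and the AMSE definition (squared error between imputed and true values, averaged over the missing cases and then over replications) are all assumptions and are not taken from the paper, which may define these differently.

```python
import numpy as np

def simulate(n, sigma, rng):
    # Hypothetical design: two standard-normal predictors, arbitrary coefficients.
    X = rng.normal(size=(n, 2))
    y = 10.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=sigma, size=n)
    return X, y

def mar_mask(X, frac, rng):
    # MAR (assumed mechanism): the chance that y is missing depends only on
    # the fully observed predictor x1, never on y itself.
    n = X.shape[0]
    k = max(1, int(round(frac * n)))
    w = 1.0 / (1.0 + np.exp(-X[:, 0]))
    idx = rng.choice(n, size=k, replace=False, p=w / w.sum())
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask

def regression_impute(X, y, mask, rng=None):
    # RI: fit least squares on the complete cases and predict the missing y.
    # SRI (when rng is given): add a normal residual draw to each prediction.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A[~mask], y[~mask], rcond=None)
    y_imp = y.copy()
    y_imp[mask] = A[mask] @ beta
    if rng is not None:
        resid = y[~mask] - A[~mask] @ beta
        dof = (~mask).sum() - A.shape[1]
        s = np.sqrt(resid @ resid / dof)
        y_imp[mask] += rng.normal(scale=s, size=mask.sum())
    return y_imp

def amse(n, sigma, frac, reps=200, stochastic=False, seed=0):
    # Assumed AMSE: mean over replications of the MSE on the imputed cases.
    rng = np.random.default_rng(seed)
    mses = []
    for _ in range(reps):
        X, y = simulate(n, sigma, rng)
        mask = mar_mask(X, frac, rng)
        y_imp = regression_impute(X, y, mask, rng if stochastic else None)
        mses.append(np.mean((y_imp[mask] - y[mask]) ** 2))
    return float(np.mean(mses))
```

Calling `amse(n, sigma, frac)` over the grid from the abstract ($\sigma$ in 5, 10, 15; n in 30, 50, 100, 200; missing fractions 0.05 to 0.20) would give one AMSE value per setting for each baseline. The proposed MRI, EMMI, and NARI methods are not reproduced here because the abstract does not specify them.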

How to Cite

สระโสม ส., & มยุรีสวรรค์ ธ. (2019). Comparison of Missing Data Imputation Methods in Dependent Variable with Missing at Random for Multiple Linear Regression. KKU Science Journal, 47(4), 737–748. Retrieved from https://ph01.tci-thaijo.org/index.php/KKUSciJ/article/view/250058

Issue

Vol. 47 No. 4 (2019): October - December 2019

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
