A Comparison of Subset Selection in Multiple Linear Regression Model by Using Bayesian Variable Selection and Stepwise Regression


Kannat Na Bangchang

Abstract

This research studies variable selection for multiple linear regression models using Bayesian variable selection implemented with Gibbs sampling (GS), and compares its variable-selection efficiency with that of stepwise regression (SR). The study uses simulated data, covering cases both with and without multicollinearity. Sample sizes of 25 and 100 are used, with 500 replications for each sample size. A significance level of 0.05 is set for selecting independent variables with SR. The comparison criteria are the percentage of accurate variable selection and the mean squared error (MSE). When the simulated dataset is small, GS achieves a higher percentage of accurate selection than SR, both with and without multicollinearity. As the sample size increases, however, the accuracy of SR in the case without multicollinearity also increases and becomes equivalent to that of GS. The MSE is equal for both methods and very low, because the predicted values of the dependent variable obtained from each method are close to the true values. Conversely, when multicollinearity is present, neither GS nor SR is able to select the variables for the model accurately.
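
As a rough illustration of the two comparison criteria named in the abstract, the Python sketch below computes a percentage of accurate variable selection over a set of replications and a mean squared error. It assumes that a selection counts as "accurate" only when the selected subset exactly matches the true subset. The abstract does not state this definition. The function names, variable labels, and toy values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def selection_accuracy(selected_sets, true_set):
    """Percentage of replications whose selected subset equals the true subset.

    Assumption: "accurate" means an exact match with the true subset.
    """
    true_set = frozenset(true_set)
    hits = sum(1 for s in selected_sets if frozenset(s) == true_set)
    return 100.0 * hits / len(selected_sets)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted responses."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return fmean([(a - b) ** 2 for a, b in zip(y_true, y_pred)])


# Toy values for illustration only (not results from the study).
true_vars = {"x1", "x3"}
selected = [{"x1", "x3"}, {"x1"}, {"x1", "x3"}, {"x1", "x2", "x3"}]

accuracy = selection_accuracy(selected, true_vars)  # 2 of 4 subsets match exactly
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```

In a study of this design, `selected` would hold one selected subset per replication (500 per sample size here), and the accuracy would be computed separately for GS and SR under each multicollinearity condition.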

Article Details

How to Cite
Na Bangchang, K. (2026). A Comparison of Subset Selection in Multiple Linear Regression Model by Using Bayesian Variable Selection and Stepwise Regression. KKU Science Journal, 54(1), 207–214. https://doi.org/10.14456/kkuscij.2026.15
Section
Research Articles
