The Accuracy of Multiple Regression Models in Conditional Randomized Database Analysis to Predict Academic Performance of Naval Cadets.



Sasithorn Kongudomthrap


The objective of this article is to assess the accuracy of multiple linear regression models as a tool for analyzing databases generated by conditional randomization. A computer program was designed to create such a database, with the randomization constrained so that academic performance correlates with the factors affecting it: personal factors and teaching factors. The conditions are based on actual results from the course Probability and Statistics for Engineering at the Royal Thai Naval Academy in 2019. After 30 academic years of data had been generated with the program, the data were analyzed with multiple linear regression to find an equation for predicting academic performance. When one academic year from the generated database was used to build the prediction equation, the prediction accuracy was approximately 84%. When three years were used, the average accuracy was approximately 88.4%. When 5, 10, 15 and 20 academic years were used, the prediction accuracy was approximately 90%, 91.67%, 91.67% and 93%, respectively. When tested with the 30-year randomized database, the prediction equation is

                                                y = -2.47317 + 0.13134a + 0.00209b  

where a is the average of the personal factors and b is the average of the teaching factors.

This equation accurately predicted the academic performance of 96% of the students. The multiple linear regression model gains predictive accuracy as the database used to build the prediction equation grows.
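The procedure summarized above (generate randomized yearly data, fit a multiple linear regression on a growing number of years, then check prediction accuracy) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the factor ranges, the noise level, the number of students per year, and the rule that a prediction counts as accurate when it falls within a tolerance `tol` of the true score are all hypothetical. The reported coefficients are reused only as an illustrative generating model.

```python
import numpy as np

# Illustrative generating coefficients, borrowed from the reported equation
# y = -2.47317 + 0.13134a + 0.00209b. Using them to simulate data is an assumption.
TRUE_COEF = np.array([-2.47317, 0.13134, 0.00209])


def simulate_year(rng, n_students=100, noise_sd=0.2):
    """Generate one hypothetical academic year of (a, b, y) records."""
    a = rng.uniform(20.0, 40.0, n_students)  # personal-factor average (assumed range)
    b = rng.uniform(20.0, 40.0, n_students)  # teaching-factor average (assumed range)
    noise = rng.normal(0.0, noise_sd, n_students)  # assumed random variation
    y = TRUE_COEF[0] + TRUE_COEF[1] * a + TRUE_COEF[2] * b + noise
    return a, b, y


def fit_mlr(a, b, y):
    """Ordinary least squares fit of y = c0 + c1*a + c2*b."""
    X = np.column_stack([np.ones_like(a), a, b])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, a, b):
    """Evaluate the fitted equation for given factor averages."""
    return coef[0] + coef[1] * a + coef[2] * b


def accuracy(y_true, y_pred, tol=0.5):
    """Share of predictions within tol of the true score (assumed criterion)."""
    return float(np.mean(np.abs(y_true - y_pred) <= tol))


def run_experiment(seed=0, year_counts=(1, 3, 5, 10, 15, 20, 30)):
    """Fit on the first k simulated years and score on one held-out year."""
    rng = np.random.default_rng(seed)
    years = [simulate_year(rng) for _ in range(max(year_counts))]
    test_a, test_b, test_y = simulate_year(rng)  # held-out year for scoring
    results = {}
    for k in year_counts:
        a = np.concatenate([yr[0] for yr in years[:k]])
        b = np.concatenate([yr[1] for yr in years[:k]])
        y = np.concatenate([yr[2] for yr in years[:k]])
        coef = fit_mlr(a, b, y)
        results[k] = (coef, accuracy(test_y, predict(coef, test_a, test_b)))
    return results


for k, (coef, acc) in run_experiment().items():
    print(f"{k:2d} years: y = {coef[0]:.5f} + {coef[1]:.5f}a + {coef[2]:.5f}b, "
          f"accuracy = {acc:.2%}")
```

In this toy setup the accuracy is governed mainly by the assumed noise level and tolerance, so the printed values need not match the percentages reported above. What the sketch shows is the mechanism: the fitted coefficients become more stable as more simulated years are pooled.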






