Application of Data Mining Techniques for Classification of Traffic Affecting Environments

Kanokwan Khiewwan; Phrommate Weeraphan; Khumphicha Tantisontisom; Jindaporn Ongate

Authors

Kanokwan Khiewwan, Computer Technology, Faculty of Industrial Technology, Kamphaeng Phet Rajabhat University 62000
Phrommate Weeraphan
Khumphicha Tantisontisom
Jindaporn Ongate

Keywords:

Traffic, Data Mining, Decision Tree, K Nearest Neighbor, Support Vector Machine

Abstract

This research explores which data mining techniques are appropriate for classifying traffic volume data and identifying the factors that influence traffic. The data set comprised 31,147 records of westbound traffic volume from MN DoT ATR station 301, roughly midway between Minneapolis and St Paul, MN. The data, retrieved from the UCI Machine Learning Repository, span 2014 to 2018. In the experiment, Decision Tree (DT) achieved the highest classification accuracy at 79.57 percent, followed by k-Nearest Neighbors (k-NN) with k = 1 at 73.27 percent and Support Vector Machine (SVM) at 59.41 percent. Additionally, the DT model identified time as the most important factor affecting traffic volume.
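As a rough illustration of the kind of evaluation described above, the following minimal sketch classifies records with k-NN at k = 1 and reports accuracy as the fraction of correctly labelled test records. It is not the authors' implementation: the toy records, the two features (hour of day and temperature), and the class labels "low", "medium" and "high" are all hypothetical, since the abstract does not state how traffic volume was discretized or which tool was used.

```python
import math
from collections import Counter

# Hypothetical toy records: (hour_of_day, temperature_K, traffic_class).
# Values and class labels are invented for illustration; they are NOT
# taken from the Metro Interstate Traffic Volume data set.
TRAIN = [
    (2, 275.0, "low"),
    (4, 272.0, "low"),
    (8, 280.0, "high"),
    (12, 288.0, "medium"),
    (17, 290.0, "high"),
    (22, 279.0, "low"),
]
TEST = [
    (3, 274.0, "low"),
    (7, 281.0, "high"),
    (13, 289.0, "medium"),
    (23, 277.0, "low"),
    (18, 291.0, "medium"),
]


def knn_predict(train, features, k=1):
    """Return the majority class among the k nearest training records.

    Distance is plain Euclidean distance on the raw features; a real
    study would normally scale the features first.
    """
    nearest = sorted(train, key=lambda rec: math.dist(rec[:2], features))[:k]
    votes = Counter(rec[2] for rec in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=1):
    """Fraction of test records whose predicted class matches the label."""
    correct = sum(knn_predict(train, rec[:2], k) == rec[2] for rec in test)
    return correct / len(test)


if __name__ == "__main__":
    print(f"1-NN accuracy on toy data: {accuracy(TRAIN, TEST, k=1):.2%}")
```

The same `accuracy` helper could be reused to compare other classifiers, such as a decision tree or an SVM, on an identical train/test split, which is the kind of comparison the reported percentages summarize.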

References

Khaimook, S., Yoh, K., Inoi, H., & Doi, K. (2019). Mobility as a service for road traffic safety in a high use of motorcycle environment. IATSS Research, 43(4), 235-241.

Sun, Y. (2012). Research on urban road traffic congestion charging based on sustainable development. Physics Procedia, 24, 1567-1572.

Bhavsar, P., Safro, I., Bouaynaya, N., Polikar, R., & Dera, D. (2017). Machine learning in transportation data analytics. Data Analytics for Intelligent Transportation Systems, 283-307.

Zhang, L., Liu, Q., Yang, W., Nai, W., & Dong, D. (2013). An improved K-nearest neighbor model for short-term traffic flow prediction. Procedia - Social and Behavioral Sciences, 96, 653-662.

Antonio, J. (2016). Automated classification of urban locations for environmental noise impact assessment on the basis of road-traffic content. Expert Systems with Applications, 53, 1-13.

Dash, R. (2013). Selection of the best classifier from different datasets using WEKA. International Journal of Engineering Research & Technology (IJERT), 2(3), 1-7.

Tan, L. (2015). Code comment analysis for improving software quality. The Art and Science of Analyzing Software Data, 493-517.

Gove, R. (2012). Machine learning and event-based software testing: classifiers for identifying infeasible GUI event sequences. Advances in Computers, 86, 109-135.

Yan-yan, S., & Ying, L. (2015). Decision tree methods: applications for classification and prediction. Shanghai Archives of Psychiatry, 27(2), 130-135.

Imandoust, S. B., & Bolandraftar, M. (2013). Application of K-nearest neighbor (KNN) approach for predicting economic events: theoretical background. Journal of Engineering Research and Applications, 3(5), 605-610.

Entezari-Maleki, R., Rezaei, A., & Minaei-Bidgoli, B. (2009). Comparison of classification methods based on the type of attributes and sample size. Journal of Convergence Information Technology, 4(3), 94-102.

Center for Machine Learning and Intelligent Systems. (2019). Metro interstate traffic volume data set. Retrieved February 20, 2019, from https://archive.ics.uci.edu/ml/datasets/metro+interstate+traffic+volume