Development Cyber Risk Assessment for Intrusion Detection Using Enhanced Random Forest
Abstract
In cybersecurity, the lack of statistical data on cyber-attacks presents a significant challenge from an insurance perspective: it hinders the accurate calculation of insurance premiums, the assessment of cybersecurity risk exposure, and the identification of high-risk threat categories. Effective intrusion detection systems (IDS) are paramount in addressing these issues. This research introduces a cyber risk assessment model that uses the Random Forest classification algorithm, tailored explicitly for IDS, and leverages the CIC-IDS 2017 dataset. The central objective was to engineer robust models capable of classifying a broad array of cyber threats, with a focus on classification accuracy. Through systematic experimentation and hyperparameter tuning, the model achieved an average classification accuracy of 96.94%.
This study found that varying 'n_estimators' from 10 to 300 did not affect cyber-attack classification performance. It also showed that bagging and bootstrapping improve model stability by reducing variance, which improves accuracy without requiring many trees. Model performance was high, with an average F1-Score of 97.86%. Given the scarcity of cyber-attack statistics, risk assessment allows for informed self-insurance or risk transfer processes, ensuring that policies align with risk management strategies and premium calculations.
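The bagging and ensemble-size observations above can be illustrated with a small sketch. It is not the authors' model and does not use CIC-IDS 2017: it assumes synthetic two-feature data, uses one-split decision stumps in place of full trees, and every name in it (make_data, fit_stump, fit_bagged, sweep) is hypothetical. It only shows the mechanics of bootstrap sampling, majority voting, and a macro-averaged F1-Score computed for several ensemble sizes.

```python
import random


def make_data(n, rng):
    # Assumed stand-in for flow features: label 1 ("attack") when x0 + x1 > 1,
    # with 10% label noise. This is synthetic data, not CIC-IDS 2017.
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] + x[1] > 1.0)
        if rng.random() < 0.1:
            y = 1 - y
        rows.append((x, y))
    return rows


def stump_predict(stump, x):
    # A stump is (feature index, threshold, polarity).
    f, t, polarity = stump
    above = x[f] > t
    return int(above) if polarity == 1 else int(not above)


def fit_stump(sample, rng):
    # Random feature choice plus the best of 10 candidate thresholds.
    f = rng.randrange(2)
    best_stump, best_correct = None, -1
    for x, _ in rng.sample(sample, 10):
        for polarity in (0, 1):
            stump = (f, x[f], polarity)
            correct = sum(stump_predict(stump, xi) == yi for xi, yi in sample)
            if correct > best_correct:
                best_stump, best_correct = stump, correct
    return best_stump


def fit_bagged(train, n_estimators, rng):
    # Bootstrap: each stump is fit on len(train) rows drawn with replacement.
    return [fit_stump([rng.choice(train) for _ in train], rng)
            for _ in range(n_estimators)]


def predict(ensemble, x):
    # Majority vote over all stumps (ties resolve to class 0).
    votes = sum(stump_predict(s, x) for s in ensemble)
    return int(2 * votes > len(ensemble))


def macro_f1(y_true, y_pred):
    # Unweighted mean of the per-class F1 scores for classes 0 and 1.
    scores = []
    for c in (0, 1):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def sweep(sizes=(10, 50, 100, 300), seed=0):
    # Train one bagged ensemble per size and score it on held-out data.
    rng = random.Random(seed)
    train, test = make_data(200, rng), make_data(200, rng)
    y_true = [y for _, y in test]
    results = {}
    for n in sizes:
        ensemble = fit_bagged(train, n, rng)
        results[n] = macro_f1(y_true, [predict(ensemble, x) for x, _ in test])
    return results


if __name__ == "__main__":
    for n, f1 in sweep().items():
        print(f"n_estimators={n:3d}  macro-F1={f1:.3f}")
```

The scores printed by this sketch depend on the synthetic data and the seed, and they say nothing about the figures reported in the abstract. In practice a library implementation would normally be used instead, for example scikit-learn's RandomForestClassifier, which exposes the ensemble size as its n_estimators parameter.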
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.