Optimizing mushroom classification through machine learning and hyperparameter tuning
Abstract
This research explores the application of machine learning to classifying mushrooms as poisonous or edible, a task in which model performance matters directly for food safety. The study compares four classification algorithms (Random Forest, Logistic Regression, Decision Tree, and Naive Bayes) and then optimizes the two best models through hyperparameter tuning with Grid Search. The proposed method involves Exploratory Data Analysis (EDA), data preprocessing, classification modeling, performance evaluation, and hyperparameter tuning. The dataset used is the Mushroom Classification data. The results show that the Random Forest algorithm performs best, with ROC values close to 100%, high recall, and a good F1-score. Hyperparameter tuning further improved the ROC and recall of the Random Forest model, indicating that it adapts well to the nature of the dataset. These findings underline the importance of robust data processing and model optimization for accurate and reliable predictions in mushroom classification, contributing to food safety efforts.
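The workflow summarized in the abstract (compare four classifiers, keep the two best, tune them with Grid Search) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic categorical features with a made-up labelling rule, the Bernoulli variant of Naive Bayes, one-hot encoding, the parameter grids, and ranking by ROC AUC are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for categorical mushroom features (NOT the real dataset).
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(600, 6)).astype(str)
# Hypothetical labelling rule: 1 = poisonous, 0 = edible.
y = ((X[:, 0] == "1") | ((X[:, 1] == "2") & (X[:, 2] != "0"))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

def make_model(clf):
    """Preprocessing (one-hot encoding) plus a fresh copy of the classifier."""
    return make_pipeline(OneHotEncoder(handle_unknown="ignore"), clone(clf))

def evaluate(model):
    """ROC AUC, recall, and F1 on the held-out test split."""
    proba = model.predict_proba(X_test)[:, 1]
    pred = model.predict(X_test)
    return {
        "roc_auc": roc_auc_score(y_test, proba),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

# The four algorithms named in the abstract (the Naive Bayes variant is assumed).
candidates = {
    "random_forest": RandomForestClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "naive_bayes": BernoulliNB(),
}

# Step 1: baseline comparison with default hyperparameters.
scores = {
    name: evaluate(make_model(clf).fit(X_train, y_train))
    for name, clf in candidates.items()
}

# Step 2: keep the two best models (ranked here by ROC AUC, an assumption).
top_two = sorted(scores, key=lambda n: scores[n]["roc_auc"], reverse=True)[:2]

# Step 3: grid search over small, illustrative grids. The step-name prefixes
# are the lowercased class names that make_pipeline assigns automatically.
param_grids = {
    "random_forest": {
        "randomforestclassifier__n_estimators": [50, 100],
        "randomforestclassifier__max_depth": [None, 5],
    },
    "logistic_regression": {"logisticregression__C": [0.1, 1.0, 10.0]},
    "decision_tree": {"decisiontreeclassifier__max_depth": [None, 3, 5]},
    "naive_bayes": {"bernoullinb__alpha": [0.1, 1.0]},
}

tuned = {}
for name in top_two:
    search = GridSearchCV(
        make_model(candidates[name]), param_grids[name], scoring="roc_auc", cv=3
    )
    search.fit(X_train, y_train)
    tuned[name] = evaluate(search.best_estimator_)

for name in top_two:
    print(name, "baseline:", scores[name], "tuned:", tuned[name])
```

Scoring the grid search on ROC AUC mirrors the metric the abstract highlights; in a safety-oriented setting, recall on the poisonous class could reasonably be used as the selection metric instead.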
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.