Used Car Price Prediction Using Web Scraping and Machine Learning Models
Abstract
This study aimed to develop a predictive model for used car prices in Thailand using web scraping and machine learning. Data were collected from Kaidee.com, One2Car.com, and Chobrod.com, totaling 55,989 records. After data cleaning, 42,823 valid records remained, each with 13 base attributes for model construction. Two experiments were conducted: (1) using only the base features and (2) adding six engineered features (car age, annual usage rate, squared mileage, squared engine size, cumulative usage load, and temporal load) to enhance the model’s learning capability. Five models (XGBoost, Random Forest, LightGBM, CatBoost, and Gradient Boosting) were compared using MAE, RMSE, MAPE, R², and accuracy; the reported accuracy values equal 100% minus MAPE. XGBoost achieved the best prediction performance. With the additional features, R² improved from 0.9262 to 0.9419, accuracy increased from 89.42% to 92.38%, and MAPE decreased from 10.58% to 7.62%, indicating that feature engineering improved predictive accuracy. Feature importance analysis identified fuel type, car type, engine size, brand, and squared engine size as the most influential factors for used car prices. The findings indicate that combining machine learning with feature engineering improves predictive performance and can serve as a decision-support tool for buyers, sellers, and financial institutions, promoting transparency and fairness in Thailand’s used car market.
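The five evaluation measures named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `regression_metrics` and the sample prices are hypothetical, and accuracy is assumed to be 100% minus MAPE, which is consistent with the reported pairs (89.42% with 10.58%, and 92.38% with 7.62%) but is not defined explicitly in the abstract.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (%), R2, and accuracy (%) for paired price lists.

    Accuracy is ASSUMED to be 100 - MAPE; the abstract does not define it.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # MAPE is undefined when a true price is zero; prices are assumed positive.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    # R2 = 1 - SS_res / SS_tot; SS_tot is zero if all true prices are identical.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {
        "MAE": mae,
        "RMSE": rmse,
        "MAPE": mape,
        "R2": r2,
        "Accuracy": 100.0 - mape,
    }


# Made-up prices in Thai baht, for illustration only (not data from the study).
actual = [400_000, 250_000, 800_000, 500_000]
predicted = [380_000, 275_000, 760_000, 500_000]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Under this assumed definition, accuracy and MAPE always sum to 100%, so the two measures carry the same information; MAE and RMSE remain in price units, while R² is unitless.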
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The editorial board reserves the right not to publish any content, views, or comments from articles in the Journal of Thai-Nichi Institute of Technology without prior written permission from the authorized author(s). Published work is the copyright of the Journal of Thai-Nichi Institute of Technology.