Stock Clustering Framework using Financial Ratios: A Case Study in the Stock Exchange of Thailand
DOI:
https://doi.org/10.55003/ETH.420407
Keywords:
Stock clustering, Composite index, Financial ratios, Stock Exchange of Thailand, Investment
Abstract
Value investors typically seek undervalued stocks that meet specific financial criteria in order to maximize their margin of safety. However, manually analyzing the financial data of every listed stock is time-intensive. Furthermore, the market price of a target stock may already exceed its intrinsic value, which introduces investment risk. To address these challenges, this study proposes a stock clustering framework that groups equities by the similarity of their financial ratios. The framework is designed to streamline investment decisions: when a stock is attracting investor interest but may already be overvalued, it recommends stocks with comparable financial profiles as alternatives. Multiple clustering algorithms are evaluated to determine the most effective grouping strategy. Empirical backtesting on four years of data from the Stock Exchange of Thailand shows that the Gaussian Mixture Model (GMM) achieves the highest composite performance metric among the tested methods. Additionally, the HDBSCAN algorithm is used to detect and exclude outlier stocks, thereby improving the reliability of the clustering results.
License
Copyright (c) 2025 School of Engineering, King Mongkut’s Institute of Technology Ladkrabang

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The published articles are copyrighted by the School of Engineering, King Mongkut's Institute of Technology Ladkrabang.
The statements contained in each article in this academic journal are the personal opinions of each author and are not related to King Mongkut's Institute of Technology Ladkrabang and other faculty members in the institute.
Responsibility for all elements of each article belongs to its author; if there are any mistakes, each author is solely responsible for their own article.


