The Semantic-Based Image Classification Model Trained for Image Retrieval Using Natural Language


Chakkarin Santirattanaphakdi
Suphakit Niwattanakul


This research develops a model for classifying images by semantic meaning, using pre-trained deep learning models for text and image data with iterative learning on custom datasets. The trained models were evaluated for image retrieval with natural language labels by comparing their predictions against expert-assessed label meanings. Prediction performance was measured under three conditions: 1) descriptive text resembling the image label, 2) high-level conceptual text related to the image content, and 3) text describing the qualitative meaning of the image. These conditions yielded scores of 0.905, 0.830, and 0.585, respectively. The result for text describing the qualitative meaning of the image was only moderate, because a natural language label of this kind expresses a high-level concept. Individual perceptual experience therefore influences how such labels are judged, in line with principles of human cognition, as evidenced by the closely aligned prediction results that similar descriptive texts received across more than one option. Meaningful image retrieval should therefore emphasize reducing the semantic gap in search queries, and should assist users by using query terms aligned with image meaning rather than adhering strictly to grammatical rules. This approach is suggested as a future direction for information retrieval.

Santirattanaphakdi, C., & Niwattanakul, S. (2024). The Semantic-Based Image Classification Model Trained for Image Retrieval Using Natural Language. PKRU SciTech Journal, 8(1), 68–82. Retrieved from
