Developing a Semantic Image Classification Model from Natural Language Using Generative Artificial Intelligence

Sooksawaddee Nattawuttisit

Published: Nov 29, 2024

Keywords:

image classification, deep learning, natural language, GANs

Sooksawaddee Nattawuttisit

Master of Science Program and Doctor of Philosophy Program Faculty of Information Technology, Sripatum University

Abstract

This research aims to develop an image classification model using Generative Adversarial Networks (GANs) to improve image retrieval and interpretation through natural language processing. GANs generate new content by learning from existing data and producing outputs similar to the original samples. The study's data is drawn from the Flickr 30K dataset, which consists of 158,915 entries of images and natural language descriptions. A sample size of 384 entries was determined using Cochran's formula with a 95% confidence level and a 5% margin of error. The sample was split into training and testing sets at a ratio of 80/20 to optimize the model's performance in image interpretation. The model's performance was evaluated by the similarity between its predictions and the reference images with descriptions, and the results were validated by AI experts. The test results showed an accuracy of 82%, a recall of 78%, and a precision of 80%, indicating that the model is effective at interpreting images based on natural language descriptions. This research has commercial applications, such as automatic image categorization on social media and image retrieval in large-scale databases. Future model development should focus on improving recall to enhance completeness and better meet user needs.
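
The sample-size arithmetic in the abstract can be checked with a short sketch. It assumes the standard large-population form of Cochran's formula, n0 = z^2 * p * (1 - p) / e^2, with z = 1.96 for the 95% confidence level and p = 0.5 (maximum variability). The abstract states only the confidence level and the margin of error, so the value of p, the rounding to the nearest integer, and the integer split below are assumptions, and the function name is hypothetical.

```python
def cochran_sample_size(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> float:
    """Cochran's formula for a large population: n0 = z^2 * p * (1 - p) / e^2.

    z: z-score for the confidence level (1.96 for 95%)
    p: assumed population proportion (0.5 is an assumption, not from the abstract)
    e: margin of error (0.05 for 5%)
    """
    return (z ** 2) * p * (1 - p) / (e ** 2)


n0 = cochran_sample_size()   # unrounded sample size, about 384.16
n = round(n0)                # rounded to the nearest entry (assumed rounding rule)

# 80/20 split of the sample; truncating the training count is an assumption.
n_train = int(n * 0.8)
n_test = n - n_train

print(f"n0 = {n0:.2f}, n = {n}, train = {n_train}, test = {n_test}")
```

Under these assumptions the formula reproduces the 384 entries reported in the abstract, and an 80/20 split of that sample gives 307 training entries and 77 testing entries.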

How to Cite

Nattawuttisit, S. (2024). Developing Next-Generation Semantic Image Classification Model Through Generative Adversarial Networks (GANs). PKRU SciTech Journal, 8(2), 79–90. Retrieved from https://ph01.tci-thaijo.org/index.php/pkruscitech/article/view/257817

Issue

Vol. 8 No. 2 (2024): July - December

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The original content that appears in this journal is the responsibility of the author, excluding any typographical errors.
The copyright of manuscripts published in PKRU SciTech Journal is owned by PKRU SciTech Journal.

