A Survey of Automatic Indexing Techniques for Thai Text Documents

Todsanai Chumwatana


With the rapidly increasing number of Thai text documents available in digital media and on websites, an efficient text indexing technique is needed to facilitate search and retrieval. An efficient index speeds up response time and improves the accessibility of documents. To date, little research has been conducted on Thai text indexing compared with more widely studied languages such as English and other European languages. In Thai text indexing, the extraction of indexing terms is the main issue: because Thai text is non-segmented, that is, written without explicit word boundaries, index terms cannot be readily identified automatically from text documents. As a result, indexing Thai text documents poses many challenges. Most Thai text indexing techniques fall into two main categories, language-dependent techniques and language-independent techniques, both of which are described in this paper.

