Developing a lexicon for computerized transformation of the transliterated words of Thai Noi characters into Isan language pronunciations
Keywords:
Lexicon, Transliteration, Machine transliteration of ancient language, Machine transliteration
Abstract
The objectives of this research were to design and develop a lexicon for the computerized transformation of transliterated words of Thai Noi characters into Isan language pronunciations, and to evaluate the efficiency of the developed lexicon. The research methodology comprised three steps: 1) a study of the characteristics of Isan lexicons, drawn from seven Isan lexicon books and one standard Thai lexicon; 2) the design and development of the lexicon; and 3) an evaluation of the efficiency of the lexicon in transforming 7,929 transliterated words, used as test data, into Isan language pronunciations, together with an evaluation of the efficacy of the lexicon's design and development by 12 ancient-language linguists and 27 users of ancient Isan medicine textbooks. The results revealed that the existing Isan lexicons were of limited use for the computerized transformation of transliterated words of Thai Noi characters into Isan language pronunciations, because their main vocabulary entries were not transliterated words of Thai Noi characters and followed a variety of spelling criteria. In addition, the Thai Noi font that would be used for main entries has no Unicode code points. The designed and developed lexicon contained 4,645 main vocabulary entries, 36 parts of speech, and 10 spelling criteria for Isan language pronunciation. The accuracy of transforming transliterated ancient Isan medicine recipes into Isan language pronunciations was 71.02 percent, and the overall mean rating of the efficacy of the lexicon's design and development was at a good level for all items (x̄ = 3.89, SD = 0.71). Using transliterated words as the main vocabulary entries reduces the need to build a separate vocabulary for each ancient script.

Copyright (c) 2023 Journal of Applied Science and Emerging Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.