Thai Word Segmentation Technique for Solving Unknown Words and Ambiguous Words Using Rules-Based and Surrounding Contextual Clues
Abstract
The objective of this research is to develop a word segmentation technique for the Thai language that can solve problems related to unknown words and ambiguous words. The technique was developed and tested using the following procedure: 1) 30 documents were selected in order to examine the accuracy of word segmentation. 2) Regular dictionary-based word segmentation was conducted with the Longest Word Pattern Matching technique, the Backtracking technique, and the Dictionary B Tree. 3) The problem of unknown words was solved by using 28 invented rules. 4) Probabilities of the word segmentation patterns were analysed syntactically by using a Digraph, in order to choose the pattern that is most syntactically correct. 5) The problem of ambiguous words was solved by using rules, and a dictionary for solving ambiguous words was created. The system was tested with files of sample texts. The technique was tested 4 times in order to improve the rules, which resulted in an accuracy of 95.65%. This research can be applied for several purposes, including text-to-speech conversion, machine translation, and other relevant work.
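The dictionary-based step named in the abstract (longest word pattern matching combined with backtracking) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `segment`, the toy dictionary, and the use of a plain Python set in place of the Dictionary B Tree are all assumptions made for illustration, and the 28 unknown-word rules and the Digraph-based selection are not modelled.

```python
from typing import Optional


def segment(text: str, dictionary: set[str]) -> Optional[list[str]]:
    """Longest-match-first segmentation with backtracking.

    Returns a list of dictionary words covering `text`, or None when no
    full segmentation exists (for example, when an unknown word is present).
    A plain set stands in for the paper's Dictionary B Tree (assumption).
    """
    max_len = max((len(w) for w in dictionary), default=0)
    failed: set[int] = set()  # start positions already known to dead-end

    def solve(start: int) -> Optional[list[str]]:
        if start == len(text):
            return []
        if start in failed:
            return None
        # Try the longest candidate first, then progressively shorter ones.
        for end in range(min(len(text), start + max_len), start, -1):
            word = text[start:end]
            if word in dictionary:
                rest = solve(end)
                if rest is not None:
                    return [word] + rest
                # The remainder could not be segmented: backtrack and
                # try a shorter word at this position.
        failed.add(start)
        return None

    return solve(0)


# Hypothetical toy dictionary, chosen so that the greedy longest match fails.
toy_dictionary = {"ตา", "ตาก", "กลม"}
print(segment("ตากลม", toy_dictionary))  # ['ตา', 'กลม']
```

Here the longest match at the start is "ตาก", but the remainder "ลม" is not in the toy dictionary, so the search backtracks to the shorter word "ตา" and completes with "กลม". A `None` result marks input that the dictionary alone cannot cover, which is the situation the abstract addresses separately with unknown-word rules.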
Article Details
How to Cite
[1]
C. Mahatthanachai, K. Malaivongs, and N. Tantranont, “Thai Word Segmentation Technique for Solving Unknown Words and Ambiguous Words Using Rules-Based and Surrounding Contextual Clues”, J of Ind. Tech. UBRU, vol. 6, no. 1, pp. 1–15, Jul. 2016.
Section
Research Article
Articles published in the journal, in both print and electronic form, are the copyright of the journal.