Optical Character Recognition (OCR) enhancement using an approximate string matching technique
Abstract
Many researchers have focused on improving optical character recognition (OCR) efficiency by developing new techniques based on image processing. However, the major limitations of image processing techniques are their complexity and computational intensity, which make them unsuitable for some real-time applications. This paper presents a new method for enhancing OCR that uses a simple approximate string matching technique to complement existing OCR algorithms. The experimental results show that the proposed method can enhance the performance of OCR algorithms, as measured by precision. The accuracy of Thai word recognition increased by up to 85.72% compared with traditional OCR techniques.
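To illustrate the general idea of post-correcting OCR output with approximate string matching, the following is a minimal sketch and not the authors' implementation. It assumes Levenshtein edit distance as the similarity measure and a small word list as the lexicon; the function names `edit_distance` and `correct_word`, the `max_distance` threshold, and the sample words are all hypothetical choices made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_word(ocr_word: str, lexicon: list[str], max_distance: int = 2) -> str:
    """Return the closest lexicon entry within max_distance, else the input.

    Assumption for this sketch: ties are broken by lexicon order, and words
    farther than max_distance from every entry are left unchanged.
    """
    best, best_d = ocr_word, max_distance + 1
    for candidate in lexicon:
        d = edit_distance(ocr_word, candidate)
        if d < best_d:
            best, best_d = candidate, d
    return best


if __name__ == "__main__":
    lexicon = ["recognition", "recondition", "character"]
    print(correct_word("recogn1tion", lexicon))  # OCR confused "i" with "1"
    print(correct_word("xyz", lexicon))          # too far from every entry
```

Because this step operates only on the recognized text, it can run after any existing OCR engine without touching the image processing stage. Note that Python strings are compared per Unicode code point, so for Thai text the vowel and tone marks each count as separate symbols in this sketch.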
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.