An Experiment of Thai Speech Recognition in a Noisy Environment using PocketSphinx : A Case of Inventory Counting


อุดม ได้พร้อม
วีรวุฒิ ทัฬหิกรรม
ดัชกรณ์ ตันเจริญ

Abstract

Speech recognition technology plays an increasingly important role in everyday life. Speech recognition applications have become very popular and have made smartphones more convenient to use, for example through voice commands, turning hardware on and off, and accessing different apps and features. The technology can be especially helpful for users who are blind or visually impaired, because operating a smartphone with limited vision is a complicated task. General smartphone users also benefit from its speed and ease of operation.


The purpose of this project is to apply speech recognition technology to an item counting system. Speech recognition not only makes a smartphone easier and quicker to use, it also allows hands-free operation in situations where the hands are busy, such as counting stock items in a retail store. It can speed up the work by letting the user count items and record quantities without paper notes or typing on a device.


The goal of this project is to develop a speech-driven item counting system that runs on smartphones with the Android operating system, which has a growing number of users. Furthermore, the system can operate in noisy environments such as offices and convenience stores. The simulation results show that improving the speech recognition model reduces the error rate.
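The abstract does not say which error metric was used. As an illustration only, the sketch below assumes the error rate is the conventional word error rate (WER): the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the sample phrases are hypothetical and are not taken from the article. Because written Thai does not separate words with spaces, the sketch also assumes both transcripts have already been segmented into space-separated tokens.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: both strings are already tokenized into space-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (row-by-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical inventory-counting utterance with one misrecognized word.
print(word_error_rate("five boxes of water", "five box of water"))
```

Under this definition, a lower value after retraining or adapting the recognition model would correspond to the improvement the abstract reports.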



Article Details

Section
Research Article
