AI-Driven Sign Language Recognition System with NLP-Enhanced Transcription
Abstract
Sign language is a critical communication medium for deaf and hard-of-hearing individuals, yet the diversity of more than 300 sign languages in use worldwide presents significant challenges for automated recognition systems. This paper presents a novel approach to sign language recognition (SLR) that integrates computer vision techniques with advanced natural language processing (NLP) to improve transcription accuracy and contextual relevance. Our system employs a two-stage architecture: first, a gesture recognition component utilizing MediaPipe Holistic for landmark extraction and Long Short-Term Memory (LSTM) networks for classification; second, a text enhancement module using a bidirectional LSTM for contextual correction and grammatical improvement. Experimental results demonstrate that the NLP-enhanced system achieves 98.46% accuracy in gesture recognition while significantly improving the grammatical correctness and contextual coherence of the generated text compared to systems without NLP enhancement. The system can identify missing function words, add appropriate punctuation, and correct grammatical errors in real time. While primarily focused on American Sign Language (ASL), our approach offers insights for developing more effective and inclusive SLR technologies for other sign languages. These advances represent a meaningful step toward bridging communication gaps between signing and non-signing individuals, potentially enhancing accessibility in educational, professional, and social settings.
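The abstract itself includes no code; as an illustrative sketch only (the function name, argument layout, and zero-filling convention are our assumptions, not the paper's), the landmark-extraction stage described above is commonly implemented by flattening MediaPipe Holistic's pose, face, and hand landmarks into a fixed-length per-frame feature vector, with undetected components zero-filled so every frame has the same dimensionality for the LSTM:

```python
import numpy as np

# Landmark counts in MediaPipe Holistic output; pose landmarks also
# carry a visibility score alongside x, y, z.
POSE_LANDMARKS, FACE_LANDMARKS, HAND_LANDMARKS = 33, 468, 21


def flatten_keypoints(pose, face, left_hand, right_hand):
    """Flatten one frame's Holistic landmarks into a single vector.

    Each argument is an (N, C) array of landmark values, or None when
    that component was not detected in the frame; missing parts are
    zero-filled so the vector length stays constant.
    """
    parts = [
        (pose, (POSE_LANDMARKS, 4)),       # x, y, z, visibility
        (face, (FACE_LANDMARKS, 3)),       # x, y, z
        (left_hand, (HAND_LANDMARKS, 3)),  # x, y, z
        (right_hand, (HAND_LANDMARKS, 3)),
    ]
    return np.concatenate([
        np.asarray(p, dtype=np.float32).flatten() if p is not None
        else np.zeros(n * c, dtype=np.float32)
        for p, (n, c) in parts
    ])


# One frame with pose and left hand detected, face and right hand missing:
# 33*4 + 468*3 + 21*3 + 21*3 = 1662 features per frame.
frame = flatten_keypoints(np.zeros((33, 4)), None, np.zeros((21, 3)), None)
print(frame.shape)  # (1662,)
```

A sequence of such frames (e.g. 30 frames by 1662 features) would then be stacked into an array of shape (timesteps, features) and fed to the LSTM classifier.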
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.