A Real-Time Analysis of Human Face Emotions Using Deep Learning Techniques on Low-Cost Embedded Devices
Abstract
Emotions are among the most fundamental human expressions and can be conveyed in a variety of ways, such as through the voice, the face, or gestures. In developing systems that interact with humans, it is extremely important to recognize users' emotional responses to the system. This article presents the design and development of a system that uses the YOLOv4-tiny and YOLOv5s deep learning models to analyze emotions from human faces. The models run on a low-cost embedded device, the Jetson Nano, equipped with a connected camera. The video stream received from the camera is processed by detecting and framing faces, and the emotional analysis of each face is then displayed in real time. The models can classify seven emotions: anger, disgust, fear, happiness, sadness, surprise, and neutral. The RAF-DB image dataset was used to train and test the models. Upon evaluation, we found that the YOLOv5s model outperformed the YOLOv4-tiny model in terms of accuracy, achieving an F1 score of 0.806 compared to 0.774 for YOLOv4-tiny. In terms of processing speed, the YOLOv5s model displays video at roughly 11 FPS, while YOLOv4-tiny displays video at approximately 10.5 FPS.
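To make the pipeline described above concrete, the following is a minimal sketch of the real-time loop, assuming a YOLOv5s checkpoint fine-tuned on RAF-DB for the seven emotion classes (the file name emotion_yolov5s.pt is hypothetical) and loaded through PyTorch Hub; it illustrates the general approach rather than reproducing the authors' exact implementation.

import cv2
import torch

# Load a custom YOLOv5s checkpoint via PyTorch Hub (assumed weights file).
model = torch.hub.load("ultralytics/yolov5", "custom", path="emotion_yolov5s.pt")

cap = cv2.VideoCapture(0)  # camera attached to the Jetson Nano
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # YOLOv5 expects RGB input, while OpenCV delivers BGR frames.
    results = model(frame[:, :, ::-1])  # detect faces and classify each one's emotion
    # Draw a bounding box and emotion label for every detected face.
    for *box, conf, cls in results.xyxy[0].tolist():
        x1, y1, x2, y2 = (int(v) for v in box)
        label = f"{model.names[int(cls)]} {conf:.2f}"
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("Emotion analysis", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()

The reported F1 scores combine precision and recall as F1 = 2 * precision * recall / (precision + recall), so the 0.806 versus 0.774 comparison reflects a balanced gain in detection quality rather than precision alone.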
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The published articles represent the opinions of their authors only; the authors are responsible for any legal consequences that may arise from their articles.