Development of A Food Categories and Calories Estimation Full Stack System Based on Multi-CNNs Structures

Kanjanapan Sukvichai
Warayut Muknumporn


Humans require different amounts of food and nutrition depending on age, gender and health. The amount of food intake can create health problems, especially for infants, the elderly and diabetics. The traditional nutrition booklet is not suitable for most people since it is hard to understand. Nutrition is also hard to extract for Thai food, and most Thai dishes are not in the booklet since it focuses on Western dishes. This research focuses on the development of a full stack AI system that categorizes Thai food dishes, classifies and localizes the ingredients in each dish, and estimates nutrition and calories. Multiple Convolutional Neural Networks (CNNs) are used to perform these categorization, classification and localization tasks. The designed system is separated into an AI backend and a mobile application frontend based on OpenCV on an Android smartphone. MobileNet is used as the food categorizer, while a You-Only-Look-Once (YOLO) network works as the ingredient classifier and localizer. The ingredients in the pictures are then cropped and passed through a traditional image processing algorithm with predetermined parameters, which converts pixel counts into real-dimension areas using Thai coins as the size reference. Pixel areas of non-uniformly shaped ingredients are segmented, and nutrition and calories are estimated via a standard reference lookup table. The full stack system is based on a RESTful protocol, with JSON used to communicate between the smartphone and the AI server. The designed CNNs and full stack system are trained, tested, verified and deployed, so that a food image captured by the smartphone application can be used to estimate nutrition and calories. Finally, the useful information is displayed on the smartphone screen.
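
The coin-referenced area conversion and table lookup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The coin diameter, the per-area calorie values, the function names and the JSON field names are all assumptions introduced here for illustration. In particular, the calorie values are placeholders and are not taken from any nutrition table.

```python
import json
import math

# Assumed diameter of the reference coin in centimetres; replace with the
# measured diameter of whichever coin is actually placed next to the dish.
COIN_DIAMETER_CM = 2.6

# Hypothetical placeholder values (kcal per cm^2 of visible ingredient area).
# These are NOT real nutrition data; a real system would read them from a
# standard reference table.
KCAL_PER_CM2 = {"fried_egg": 1.5, "rice": 2.0}


def cm2_per_pixel(coin_pixel_area: float,
                  coin_diameter_cm: float = COIN_DIAMETER_CM) -> float:
    """Return the scale factor (cm^2 per pixel) implied by the coin.

    The coin's real area is known from its diameter, so dividing it by the
    number of pixels the coin covers gives the real area of one pixel.
    """
    if coin_pixel_area <= 0:
        raise ValueError("coin_pixel_area must be positive")
    coin_area_cm2 = math.pi * (coin_diameter_cm / 2) ** 2
    return coin_area_cm2 / coin_pixel_area


def estimate(ingredient_pixels: dict, coin_pixel_area: float) -> dict:
    """Convert segmented pixel areas to cm^2 and look up calories."""
    scale = cm2_per_pixel(coin_pixel_area)
    items = []
    for name, pixels in ingredient_pixels.items():
        area_cm2 = pixels * scale
        kcal = area_cm2 * KCAL_PER_CM2[name]  # lookup-table step
        items.append({
            "ingredient": name,
            "area_cm2": round(area_cm2, 2),
            "kcal": round(kcal, 1),
        })
    total = round(sum(item["kcal"] for item in items), 1)
    return {"items": items, "total_kcal": total}


if __name__ == "__main__":
    # Pixel counts would come from the segmentation step; these are made up.
    result = estimate({"fried_egg": 12000, "rice": 30000}, coin_pixel_area=4000)
    # A JSON body of this kind could be returned by the server to the phone.
    print(json.dumps(result, indent=2))
```

The sketch assumes the coin and the food lie in roughly the same plane and that the camera looks straight down, so a single scale factor applies to the whole image.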


Research Article

