Publication Date:
Author(s): Yashaswini Rajendra Bhat, Kathleen L. Keller, Timothy R. Brick, Alaina L. Pearce
Publisher: Frontiers Media SA
Publication Type: Academic Journal Article
Journal Title: Frontiers in Nutrition
Volume: 12
Abstract:
Introduction: Assessing eating behaviors such as eating rate can shed light on risk for overconsumption and obesity. Current approaches either use sensors that disrupt natural eating or rely on labor-intensive video coding, which limits scalability.
Methods: We developed ByteTrack, a deep learning system for automated bite count and bite-rate detection from video-recorded child meals. The dataset comprised 1,440 minutes from 242 videos of 94 children (ages 7–9 years) consuming four meals, spaced one week apart, with identical foods served in varying amounts. ByteTrack operates in two stages: (1) face detection via a hybrid Faster R-CNN and YOLOv7 pipeline, and (2) bite classification using an EfficientNet convolutional neural network combined with a long short-term memory (LSTM) recurrent network. The model was designed to handle blur, low light, camera shake, and occlusions (hands or utensils blocking the mouth). Performance was compared with manual observational coding.
Results: On a test set of 51 videos, ByteTrack achieved an average precision of 79.4%, recall of 67.9%, and F1 score of 70.6%. Agreement with the gold-standard coding, assessed by intraclass correlation coefficient, averaged 0.66 (range 0.16–0.99), with lower reliability in videos with extensive movement or occlusions.
Discussion: This pilot study demonstrates the feasibility of a scalable, automated tool for bite detection in children’s meals. While results were promising, performance decreased when faces were partially blocked or motion was high. Future work will focus on improving robustness across diverse populations and recording conditions.
Clinical trial registration: https://clinicaltrials.gov/study/NCT03341247, identifier NCT03341247.
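The precision, recall, and F1 figures in the Results can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The per-video counts and all function and variable names are hypothetical, and averaging the metrics per video is an assumption that the abstract does not state. That assumption would be consistent with the reported numbers: the harmonic mean of the reported average precision (79.4%) and recall (67.9%) is about 73.2%, not the reported F1 of 70.6%, and such a gap arises naturally when F1 is computed per video and then averaged.

```python
from statistics import mean


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from bite-level counts for one video."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical per-video counts: (true positives, false positives, false negatives).
# These are illustrative values, not data from the study.
videos = [(40, 5, 10), (12, 8, 20), (30, 10, 5)]

per_video = [precision_recall_f1(*counts) for counts in videos]

# Assumed averaging scheme: compute each metric per video, then take the mean.
avg_precision = mean(p for p, _, _ in per_video)
avg_recall = mean(r for _, r, _ in per_video)
avg_f1 = mean(f for _, _, f in per_video)

# F1 recomputed from the averaged precision and recall generally differs
# from the mean of the per-video F1 scores.
f1_of_averages = 2 * avg_precision * avg_recall / (avg_precision + avg_recall)

# The same check applied to the averages reported in the abstract.
reported_p, reported_r = 0.794, 0.679
harmonic_of_reported = 2 * reported_p * reported_r / (reported_p + reported_r)

print(f"mean per-video F1:        {avg_f1:.3f}")
print(f"F1 of averaged P and R:   {f1_of_averages:.3f}")
print(f"harmonic mean of 79.4/67.9: {harmonic_of_reported:.3f}")
```

In this toy data the video with many missed bites pulls the mean per-video F1 below the F1 of the averaged precision and recall, the same direction as the gap in the reported figures.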