Deep Learning for Ergonomic Assessment
Marker-based motion capture is currently the most accurate method of measuring human kinematics, but it can only track objects fitted with reflective markers, which makes it unsuitable for human pose tracking in most work environments. The purpose of this study is to quantify the accuracy of pose estimation from a monocular electro-optical sensor, using deep learning to infer segment end points and pose. The accuracy assessment is performed with a Vicon Nexus and an iPhone, both running at 240 Hz. Visual 3D joint angles are computed from the marker data. The iPhone is placed so that its view is perpendicular to the sagittal plane. Deep learning algorithms produce 3D pose information that is translated into hip, knee, and ankle sagittal plane joint angles.
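As an illustration of how pose keypoints can be turned into a sagittal plane joint angle, the following is a minimal sketch, not the study's implementation. It assumes each joint center is already available as a 2D point in the sagittal plane. The function name `joint_angle_deg`, the example coordinates, and the convention that flexion equals 180 degrees minus the included angle are assumptions made for the example.

```python
import math

def joint_angle_deg(proximal, joint, distal):
    """Included angle at `joint` between the two adjoining segments, in degrees.

    Each argument is a hypothetical (x, y) point in the sagittal plane,
    e.g. hip, knee, ankle for the knee angle.
    """
    v1 = (proximal[0] - joint[0], proximal[1] - joint[1])
    v2 = (distal[0] - joint[0], distal[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot is numerically stable near 0 and 180 degrees
    return math.degrees(math.atan2(abs(cross), dot))

# Illustrative keypoints only (arbitrary units), not measured data
hip, knee, ankle = (0.0, 1.0), (0.0, 0.5), (0.5, 0.5)
included = joint_angle_deg(hip, knee, ankle)
knee_flexion = 180.0 - included  # assumed convention: 0 = fully extended
```

The same function applies to the hip and ankle by choosing the appropriate triplet of points, for example shoulder, hip, knee for the hip.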
Pearson r correlations compare MediaPipe joint angles with Visual 3D joint angles across running, lifting, ladder climbing, and step motion capture data. On the running data, the markerless joint angles correlate with Visual 3D at the hip (0.968), knee (0.983), and ankle (0.928). The markerless methods are limited in their ability to predict maximum flexion and extension angles.
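A high Pearson r can coexist with an underestimated peak angle, because r is insensitive to a constant scaling of one signal. The sketch below shows this with synthetic data; the traces, the 0.8 scale factor, and the names `pearson_r`, `reference`, and `markerless` are assumptions chosen for illustration and are not results from the study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic joint-angle traces in degrees (not measured data)
reference = [60.0 * math.sin(2.0 * math.pi * t / 50.0) for t in range(100)]
# Hypothetical markerless estimate that tracks the shape but undershoots amplitude
markerless = [0.8 * angle for angle in reference]

r = pearson_r(reference, markerless)
peak_error = max(reference) - max(markerless)  # difference in maximum angle
```

Here `r` is effectively 1 even though the peak angle is underestimated by more than 10 degrees, so reporting a peak or range-of-motion difference alongside r gives a fuller picture of agreement.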
Although the correlation values are high, in practice these differences in maximum range of motion may affect future interpretation of the data. Care should be taken when relying on extreme joint angles estimated by deep learning algorithms. Although the open-source methods are not yet as accurate as marker-based motion capture, they could enable data collection from workers in a manufacturing environment. Given the ease of data collection, this could facilitate safety tracking applications in work areas that require safe human poses or movements. Real-time pose estimation is demonstrated with a classifier that assesses office desk ergonomics.