Today we look at pose estimation and its accuracy for use in various applications. When starting this exploration, we surveyed the libraries available and settled on two: PoseNet (the PyTorch implementation by Ross Wightman) and OpenPose, by Gines Hidalgo, Zhe Cao, Tomas Simon, Shih-En Wei, Hanbyul Joo, and Yaser Sheikh from Carnegie Mellon University.
PoseNet is built to run on lightweight targets such as the browser or a mobile device, whereas OpenPose is much more accurate but meant to be run on GPU-powered systems. You can see the performance benchmarks below.
Our first look was at this Olympic lifting video, comparing the output from both OpenPose and PoseNet.
PoseNet processed the video quickly, but it missed a lot of poses throughout, which you can tell by the flickering and disappearing skeleton. We then dug into the data to see how badly it missed.
As you can see, it's pretty noisy and wasn't able to track very well. So we treated it like any other noisy sensor and cleaned up the data with some smoothing and filtering.
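A minimal sketch of the kind of cleanup this implies: drop low-confidence keypoints (carrying the last good estimate forward) and smooth each coordinate with an exponential moving average. The function name, threshold, and smoothing factor here are illustrative assumptions, not the exact settings we used.

```python
import numpy as np

def smooth_keypoints(frames, conf_thresh=0.3, alpha=0.5):
    """Smooth per-frame pose keypoints like a noisy sensor stream.

    frames: array of shape (T, K, 3) holding (x, y, confidence)
    for K keypoints over T frames. Values are illustrative.
    """
    frames = np.asarray(frames, dtype=float)
    smoothed = frames.copy()
    for t in range(1, len(frames)):
        for k in range(frames.shape[1]):
            if frames[t, k, 2] < conf_thresh:
                # Low-confidence detection: treat as a dropout and
                # carry the previous smoothed position forward.
                smoothed[t, k, :2] = smoothed[t - 1, k, :2]
            else:
                # Blend the new observation with the previous
                # smoothed position (exponential moving average).
                smoothed[t, k, :2] = (alpha * frames[t, k, :2]
                                      + (1 - alpha) * smoothed[t - 1, k, :2])
    return smoothed
```

This trades a little lag for stability: the skeleton stops flickering on missed detections, at the cost of slightly delayed motion.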
Before digging too deep into using PoseNet, we moved on to OpenPose to evaluate its accuracy.
We were inspired by how accurate OpenPose could be, based on the many projects using it, such as Everybody Dance Now by Caroline Chan, Shiry Ginosar, Tinghui Zhou, and Alexei A. Efros.
Completely blown away by the accuracy here, we compared its output to PoseNet's.
We haven't had the chance to dig into its data yet, but already we can tell that it's much more accurate and tracks well. It will still need smoothing, and we've seen some work using Savitzky-Golay (Savgol) filtering that we'll be exploring.
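For reference, Savgol filtering is available off the shelf in SciPy as `scipy.signal.savgol_filter`: it fits a low-order polynomial over a sliding window, which smooths jitter while preserving the shape of fast motions better than a plain moving average. A hedged sketch of how it could apply to a single keypoint's trajectory (the window length and polynomial order here are illustrative, not tuned values):

```python
import numpy as np
from scipy.signal import savgol_filter

def savgol_smooth(xy, window=9, polyorder=2):
    """Smooth one keypoint's trajectory over time.

    xy: array of shape (T, 2) with the keypoint's (x, y) per frame.
    window must be odd and larger than polyorder.
    """
    xy = np.asarray(xy, dtype=float)
    # Filter each coordinate independently along the time axis.
    return savgol_filter(xy, window_length=window, polyorder=polyorder, axis=0)
```

Running this per keypoint over the OpenPose output should tame the residual jitter without the lag a heavier low-pass filter would introduce.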