Exploration: Pose Estimation with OpenPose and PoseNet

Today we look at pose estimation and its accuracy for use in various applications. When we started this exploration, we surveyed the available libraries and settled on two: PoseNet (the PyTorch implementation by Ross Wightman) and OpenPose by Gines Hidalgo, Zhe Cao, Tomas Simon, Shih-En Wei, Hanbyul Joo and Yaser Sheikh from Carnegie Mellon University.

PoseNet is built to run on lightweight platforms such as a browser or mobile device, whereas OpenPose is much more accurate and meant to be run on GPU-powered systems. You can see the performance benchmarks below.

PoseNet Benchmark on mobile devices
[Figure: openpose_vs_competition.png]
This analysis was performed using the same images for each algorithm and a batch size of 1. Each analysis was repeated 1000 times and then averaged. All runs were performed on a system with an Nvidia 1080 Ti and CUDA 8.
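
For anyone wanting to run a similar comparison on their own hardware, here is a minimal sketch of the "repeat and average" timing described in the caption. It is not the script behind the figure. `run_inference` is a hypothetical callable standing in for whichever model you are timing, and the function name `benchmark` is ours.

```python
import time


def benchmark(run_inference, images, repeats=1000):
    """Return the mean wall-clock seconds per image.

    run_inference is a hypothetical callable that takes one image
    (batch size of 1) and runs the model on it.
    """
    total = 0.0
    for _ in range(repeats):
        for image in images:
            start = time.perf_counter()
            run_inference(image)  # one image per call
            total += time.perf_counter() - start
    return total / (repeats * len(images))
```

Pass the same list of images to every model so the numbers are comparable. On a GPU, the framework may return before the work has finished, so you would likely need to synchronize inside `run_inference` to get honest timings.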

Our first test was on an Olympic lifting video, comparing the output from both OpenPose and PoseNet.

PoseNet

PoseNet processed the video quickly, but it missed a lot of poses along the way, which you can tell from the flickering and disappearing skeleton. We then dug into the data to see how badly it missed.

As you can see, the data is pretty noisy and PoseNet wasn’t able to track very well. So we treated it like any other noisy sensor and cleaned up the data with some smoothing and filtering.
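
One way to handle the disappearing skeleton is to treat low-confidence frames as dropouts and interpolate across them. The sketch below is an illustration of that idea, not the exact cleanup we ran. It assumes one coordinate of one joint per list, with PoseNet's per-keypoint confidence score alongside. The function name `fill_gaps` and the 0.5 threshold are our own choices for the example.

```python
def fill_gaps(values, scores, min_score=0.5):
    """Linearly interpolate over frames whose score is below min_score.

    values: one coordinate (x or y) of one joint, one entry per frame.
    scores: the keypoint confidence for the same frames.
    The 0.5 default is an assumed threshold; tune it for your footage.
    """
    n = len(values)
    good = [i for i in range(n) if scores[i] >= min_score]
    if not good:
        return list(values)  # nothing trustworthy to interpolate from

    filled = [float(v) for v in values]
    # Before the first and after the last trusted frame, hold the
    # nearest trusted value.
    for i in range(good[0]):
        filled[i] = filled[good[0]]
    for i in range(good[-1] + 1, n):
        filled[i] = filled[good[-1]]
    # Between two trusted frames, draw a straight line.
    for a, b in zip(good, good[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = values[a] + t * (values[b] - values[a])
    return filled
```

Run it once per joint and per axis. Linear interpolation is the simplest choice and works best for short dropouts; long gaps during fast movement will be flattened into a straight line.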

Before digging too deep into using PoseNet, we proceeded to look at OpenPose to evaluate its accuracy.

OpenPose

We were inspired by how accurate OpenPose could be from the many sources out there using it for various projects, such as Everybody Dance Now by Caroline Chan, Shiry Ginosar, Tinghui Zhou and Alexei A. Efros.

Completely blown away by the accuracy here, we compared its output to PoseNet’s.

We haven’t had the chance to dig into its data yet, but we can already tell that it’s much more accurate and tracks well. It will still need to be smoothed, and we’ve seen some work using Savitzky-Golay (Savgol) filtering that we’ll be exploring.
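
To give a feel for what Savgol filtering does, here is a minimal dependency-free sketch of the simplest useful case: a 5-point window with a quadratic fit, which reduces to a fixed weighted average with the coefficients (-3, 12, 17, 12, -3) / 35. The function name `savgol5` and the choice to leave the two samples at each end untouched are our assumptions for the example. In practice you would more likely reach for `scipy.signal.savgol_filter(x, window_length, polyorder)`.

```python
# Convolution weights for a 5-point quadratic Savitzky-Golay smoother.
SAVGOL_5_COEFFS = (-3, 12, 17, 12, -3)
SAVGOL_5_NORM = 35


def savgol5(values):
    """Smooth a series with a 5-point quadratic Savitzky-Golay filter.

    The first two and last two samples are copied through unchanged
    (an assumption made to keep the sketch short).
    """
    n = len(values)
    out = [float(v) for v in values]
    if n < 5:
        return out  # too short for a full window
    for i in range(2, n - 2):
        window = values[i - 2:i + 3]
        weighted = sum(c * v for c, v in zip(SAVGOL_5_COEFFS, window))
        out[i] = weighted / SAVGOL_5_NORM
    return out
```

Unlike a plain moving average, this filter fits a small polynomial inside each window, so it tends to preserve the peaks of a fast movement such as a lift instead of flattening them.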

2 comments

  1. Dario Sortino

    Dear ParleyLabs,
    I am Dario, a Politecnico di Milano Master student (Biomedical Engineering), and I just started working on my thesis project. It involves pose identification from PoseNet, so I found your article online (https://parleylabs.com/2020/01/05/exploration-pose-estimation-with-openpose-and-posenet/#:~:text=PoseNet%20is%20built%20to%20run,see%20the%20performance%20benchmarks%20below.&text=Our%20first%20look%20was%20on,from%20both%20OpenPose%20and%20Posenet.). I was curious about the filtering process that you have developed for the PoseNet skeleton data since it is very noisy as you said.

    I’m looking forward to hearing from you!

    All the best,
    Dario.

    1. suprnrdy

      Hey Dario! I opted to use OpenPose, but I’m sure this would work for PoseNet also. What I did was go through the dataframe of pose points for each joint and apply a few filtering steps. I was only able to do this in post-processing.

      1) Filter out data that is obviously erroneous. I did this by comparing the previous frame’s point with the current frame’s point, and if the points were off by more than a standard deviation, I would set the current point equal to the previous point.
      2) I would then run back through the whole dataframe and apply a smoothing function to the points to get something that was a bit more accurate.

      Hope this helps!
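
The two steps suprnrdy describes can be sketched roughly as follows. This is our reading of the comment, not the original code. It assumes the threshold is the standard deviation of the whole series for that coordinate, that each new sample is compared against the previous cleaned sample, and that the smoothing function is a centered moving average. The names `reject_jumps` and `moving_average` are ours, and plain lists stand in for the dataframe columns.

```python
from statistics import pstdev


def reject_jumps(values, n_sigma=1.0):
    """Step 1: replace frame-to-frame jumps larger than n_sigma
    standard deviations with the previous cleaned value."""
    if not values:
        return []
    # Assumption: the threshold comes from the whole series.
    threshold = n_sigma * pstdev(values)
    cleaned = [values[0]]
    for current in values[1:]:
        previous = cleaned[-1]
        if abs(current - previous) > threshold:
            cleaned.append(previous)  # treat as erroneous, hold last value
        else:
            cleaned.append(current)
    return cleaned


def moving_average(values, window=5):
    """Step 2: centered moving average; the window shrinks at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# One coordinate of one joint, with a single bad detection at frame 3.
raw_x = [100, 101, 102, 400, 103, 104]
clean_x = moving_average(reject_jumps(raw_x), window=3)
```

Both steps need the full series up front, which matches the note that this was only done in post-processing. One caveat of holding the previous value: if the first frame is itself a bad detection, or a real movement exceeds the threshold in a single frame, the output can stay stuck on the old value.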
