CrossFit Open 17.5

What do 150k results say about the CrossFit world?

Maximilian Zavodny

Creator, Observatory

Intro for non-CrossFitters: CrossFit Open 17.5 is a workout performed for time as part of a worldwide fitness competition. Anyone and everyone is invited to compete, and this year, hundreds of thousands of athletes from around the world participated. You can learn more about the CrossFit Open, the CrossFit Games, and this particular workout (as well as how to Get Started with CrossFit) from CrossFit’s websites.


CrossFit Open 17.5 was a nasty couplet of Thrusters and Double Unders, done for time with a 40-minute time cap. Beyond The Whiteboard has already done a great high-level overview of the workout results, but I wanted to dive a little deeper, so I pulled the data myself and found some unexpected and interesting patterns and anomalies.

For this analysis, I’m looking at only the athletes that completed the workout Rx. First, here’s a summary of the key statistics and a look at the overall distribution of completion times:

Rx Males:

  • # Attempting: 102,982
  • # Completing: 96,044 (93.3%)
  • # Completing in < 20 minutes: 64,241 (62.4%)
  • # Completing in < 10 minutes: 5,940 (5.8%)
  • 50th percentile: 17:38
  • 90th percentile: 11:02

Rx Females:

  • # Attempting: 47,797
  • # Completing: 45,313 (94.8%)
  • # Completing in < 20 minutes: 32,015 (67.0%)
  • # Completing in < 10 minutes: 2,992 (6.1%)
  • 50th percentile: 16:50
  • 90th percentile: 10:54
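
For readers who want to reproduce numbers like these from their own list of results, here is a minimal sketch. It is not the code used for this post: the function names, the "MM:SS" input format, and the nearest-rank percentile rule are all assumptions, and "90th percentile" is taken to mean a time faster than 90% of finishers (which is why it is a smaller number than the 50th percentile).

```python
def parse_time(mmss):
    """Convert a 'MM:SS' string into whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def format_time(total_seconds):
    """Convert whole seconds back into 'M:SS'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def percentile_time(sorted_times, pct):
    """Time that is faster than `pct` percent of finishers (nearest rank).

    `sorted_times` must be ascending, i.e. fastest first.
    """
    n = len(sorted_times)
    rank = max(1, ((100 - pct) * n + 99) // 100)  # integer ceiling
    return sorted_times[rank - 1]


def summarize(finish_times, attempted):
    """Summary statistics in the style of the lists above (hypothetical helper)."""
    times = sorted(parse_time(t) for t in finish_times)
    return {
        "attempting": attempted,
        "completing": len(times),
        "completing_pct": round(100 * len(times) / attempted, 1),
        "under_20": sum(t < 20 * 60 for t in times),
        "under_10": sum(t < 10 * 60 for t in times),
        "p50": format_time(percentile_time(times, 50)),
        "p90": format_time(percentile_time(times, 90)),
    }


# Made-up sample: 10 finishers out of 12 attempts (NOT the real data).
sample = ["9:30", "11:00", "15:00", "17:30", "19:59",
          "20:01", "25:00", "30:00", "35:00", "39:00"]
stats = summarize(sample, attempted=12)
```

Whether the percentiles should be taken over finishers only or over everyone who attempted the workout is a choice; this sketch uses finishers only.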

Overall, the distribution is very wide, highly skewed, and remarkably similar between males and females. As an interesting side note, it might at first seem strange that the distributions have such long tails on the slow end of the range. Why don’t these look like symmetric Normal Distributions? My explanation is that they *are* symmetric (and nearly normal) when converted to more appropriate units: average power output. Power is defined as work per unit time, and the prescribed work is roughly the same for every Rx athlete in a division, so average power output is proportional to the inverse of the time to complete the workout. If we use the inverse of the completion time as a rough estimate of power output, we get this much more symmetric distribution. Now the time cap is on the left, and the fastest athletes are on the right (highest power output). The only obvious asymmetry left is the cutoff at the time cap.
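
To make the inverse-time idea concrete, here is a small sketch on made-up numbers. The helper names `power_proxy` and `skewness` are my own for illustration, and the five "athletes" are invented so that their pace (workouts per hour) is spread evenly. Their completion times come out skewed toward the slow side, while the inverse of those times is symmetric again.

```python
def power_proxy(times_seconds):
    """Inverse completion time in workouts per second, a rough stand-in for power."""
    return [1.0 / t for t in times_seconds]


def skewness(values):
    """Population skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up athletes whose pace is spread evenly between 3 and 7 workouts per hour.
times = [3600 / rate for rate in (3, 4, 5, 6, 7)]  # 20:00, 15:00, 12:00, 10:00, ~8:34

time_skew = skewness(times)                # positive: long tail on the slow side
power_skew = skewness(power_proxy(times))  # approximately zero: symmetric
```

A positive skewness means a long tail toward large values (slow times), which is the shape of the completion-time distribution described above.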

Psychological Boundaries

Looking more closely at the overall distribution of completion times, there appear to be sharp changes at some key time points. For example, let’s take a closer look at that distribution around the 10- and 20-minute marks:

These prominent discontinuities initially caught my eye, but taking a closer look, they’re actually present throughout most of the distribution:

In all cases, there appears to be an excess of results just before the minute mark, and a dearth of results just after it. So what’s going on here? Imagine you were about to pick up the jump rope for your final set of 35 and the clock is at 19:15… if you *really push it* and go unbroken you could finish under 20! A time of 19:59 just sounds sooo much better than a time of 20:01. You know that; the coaches know that; the crowd knows that... and so my theory is that bubble-athletes just dig a little deeper in the final seconds to get across the line under the critical mark.
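
One simple way to check for this effect in a list of results is to compare how many times fall just before a whole-minute mark with how many fall just after. The sketch below is only an illustration: the function name and the 10-second window are assumptions, and times are taken to be whole seconds.

```python
def minute_boundary_counts(times_seconds, window=10):
    """Count results in the last `window` seconds before any whole-minute mark
    and in the first `window` seconds starting at the mark itself.

    `window` should be at most 30 so the two ranges do not overlap.
    """
    before = sum(1 for t in times_seconds if t % 60 >= 60 - window)
    after = sum(1 for t in times_seconds if t % 60 < window)
    return before, after


# Made-up times in seconds: 19:55, 19:58, 19:59, 20:00, 20:05, 20:30
sample = [1195, 1198, 1199, 1200, 1205, 1230]
before, after = minute_boundary_counts(sample)
```

If finishing times were unaffected by the clock, the two counts should be close to each other on a large data set; a consistently larger "before" count is the pattern described above.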

This has at least one practical implication that comes to mind: Suppose you pick up the jump rope for the last time and see that the clock is at 19:50. You know you won’t finish under 20, maybe you’re disappointed, but you gut through the final set and collapse knowing you tried your best... but did you really try your best? Statistically, people on the bubble found another gear in the final seconds, even though they were just as wiped out as everyone else. The reality is, probably a lot of people had another gear that they could have reached if they found themselves on the cusp of breaking their goal. Knowing that, my takeaway is: don’t let yourself get discouraged when you have a bad day or don’t beat the time you wanted on a WOD - just find that extra gear whether you’re about to finish first or last. And if you don’t think you have an extra gear - think again!

The 10-second Spike Anomaly

There is another odd feature in the data, but this one I don’t have a good explanation for. It can only be seen if we zoom in even further, again around the 20-minute mark. In this plot, every bar represents exactly 1 second, and I’ve labeled every 10th one. There appears to be a spike in the density every 10 seconds...on the “round” numbers: 20:00, 20:10, 20:20, and so on.

The spikes might not look extremely conspicuous when conflated with the normal statistical noise we expect to see in individual 1-second bins, so I did some further analysis to determine whether the pattern is statistically significant. The first thing I did was to produce an approximation for the “expectation” value in each bin, by applying a moving-average window to the density distribution. For example, that curve looks like this, for the male population:
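
As a rough illustration of this smoothing step, here is a sketch that bins times into 1-second counts and applies a centered moving average. The function names and the window length are assumptions for the example, not the values used for the plot.

```python
def counts_per_second(times_seconds, start, stop):
    """Histogram with 1-second bins covering start <= t < stop (whole seconds)."""
    counts = [0] * (stop - start)
    for t in times_seconds:
        if start <= t < stop:
            counts[t - start] += 1
    return counts


def moving_average(counts, window=61):
    """Centered moving average; the window shrinks at the edges of the list."""
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed


# Tiny made-up example with a 3-second window
expected = moving_average([1, 2, 3, 4, 5], window=3)
```

The smoothed value in each bin plays the role of the "expectation" for that bin.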

From this, I calculated a point-wise “deviation” for each point in the curve. This is obtained by subtracting the expected number of results from the actual number of results in each bin, and dividing by the expected standard deviation for that bin (under Poisson statistics, the square root of the expected count). That result looks as expected: mean-zero, no long-range trends, and more-or-less homoscedastic:
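
In code, the deviation described here might look like the following sketch. The function name is hypothetical, and the only statistical fact it relies on is that a Poisson-distributed count has a variance equal to its mean.

```python
import math


def poisson_deviations(actual, expected):
    """(actual - expected) / sqrt(expected) for each bin.

    For a Poisson-distributed count the variance equals the mean, so
    sqrt(expected) is the expected standard deviation of the bin.
    Bins with a non-positive expectation are reported as None.
    """
    return [
        (a - e) / math.sqrt(e) if e > 0 else None
        for a, e in zip(actual, expected)
    ]


# Made-up counts: three bins, each with an expectation of 100 results
deviations = poisson_deviations([110, 90, 100], [100.0, 100.0, 100.0])
```

A deviation of +1 means the bin holds one standard deviation more results than the smoothed curve predicts.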

Nevertheless, every 10th point, on average, *is* overrepresented compared to the rest. The anomaly emerges when we aggregate points according to the time value modulo 10 seconds. In other words, let’s put all the results at times that end in ‘0’ into one bin, all the results at times ending in ‘1’ into another bin, and so on. If we then look at the number of results in each bin, we get this:
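
The modulo-10 aggregation can be sketched in a few lines. Because 60 is divisible by 10, the time in whole seconds modulo 10 is the same as the last digit of the displayed seconds. The z-score helper is my own addition for illustration, assuming Poisson noise in each bin; it is not necessarily the test behind the plot.

```python
import math


def counts_by_last_digit(times_seconds):
    """Number of results whose time in whole seconds ends in each digit 0..9."""
    counts = [0] * 10
    for t in times_seconds:
        counts[t % 10] += 1
    return counts


def round_number_z_score(counts):
    """How far the '0' bin sits above the mean of the other nine bins,
    in units of the Poisson standard deviation sqrt(mean)."""
    others_mean = sum(counts[1:]) / 9
    return (counts[0] - others_mean) / math.sqrt(others_mean)


# Made-up times in seconds: 20:00, 20:10, 20:11, 20:19, 15:50
digit_counts = counts_by_last_digit([1200, 1210, 1211, 1219, 950])

# Made-up bin totals: 130 results on round numbers, 100 in every other bin
z = round_number_z_score([130] + [100] * 9)
```

With real data, a large positive z-score for the ‘0’ bin is what "overrepresented" means here.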

The error bars indicate +/-1 standard deviation, and as I suspected, the point at ‘0’ is a dramatic outlier (the odds of this happening by coincidence are 0% with many decimal places...). In other words, there really are a lot more results having times ending on round numbers than there should be, statistically speaking. So what is going on here? The excess results just before big numbers like 10 and 20 minutes can be explained by people hustling in the final seconds, but why would more people report a time of 20:20 than 20:21? Or 15:50 instead of 15:51? Do you think people are rounding their scores? Do you think people are making typos? Is there a bug in the data submission forms? Are judges subconsciously rounding up or down?

Disclaimer: I am not a representative of CrossFit, and I am not the copyright holder of the source data presented here (I read it from the Leaderboard). I’m just a fan of CrossFit and a data scientist.


Click here to download a .csv file with the source data for all 2017 Open Results, which I scraped from the Leaderboard.