An analysis of nearly 4 million pitches shows just how many mistakes umpires make

Baseball is back, and fans can anticipate another season of amazing catches, overpowering pitching, tape-measure home runs – and, yes, controversial calls that lead to blow-ups between umpires and players.

Home plate umpires are at the heart of baseball; every single pitch can require a judgment call. Yet ask any fan or player, and they’ll tell you that many of these calls are incorrect – errors that can affect strategy, statistics and even game outcomes.

Just how many mistakes are made?

Comprehensive umpire performance statistics are not readily known, tracked or made available. Major League Baseball doesn’t seem interested in sharing the historical data.

Could it be because the numbers aren’t flattering?

Luckily, every MLB pitch is tracked and the data made available, but the numbers then have to be accessed, downloaded, sorted and evaluated. This takes time and computing power. In a new study with support from a team of Boston University graduate students, we closely analyzed how many balls get called strikes and vice versa. We ranked all home plate umpires by accuracy and took their age and experience into account.

While the human element of the game certainly adds color, our results show that it comes at a high cost: far too many mistakes.

Mining the data

All 30 Major League Baseball stadiums are outfitted with triangulated tracking cameras that follow each baseball from the pitcher’s hand until it crosses home plate. Ball location can be tracked up to 50 times during each pitch, and accuracy is said to have a margin of error of 1 inch. This information is used to evaluate players, but MLB doesn’t share the results in a way that allows fans to easily evaluate the performance of umpires.

We analyzed nearly 4 million pitches over the course of the last 11 regular seasons. This data, which had been collected by MLB-owned Statcast and Pitch f/x, was sorted, formatted and superimposed on a standard strike zone map.

An example of balls and strikes superimposed over a strike zone from a 2010 game between the Boston Red Sox and Toronto Blue Jays. The red points were called strikes, and the green points were called balls. (Image credit: Pitch F/X)

Using this available technology, we measured ball and strike calls for accuracy. We then ranked the error rates for each active umpire, creating a “Bad Call Ratio.” The higher the ratio, the worse the umpire.
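
For readers who want to see the idea in concrete terms, here is a minimal sketch of how such a ranking could be computed. It is an illustration under stated assumptions, not the study's actual code. The `Pitch` record and its field names are hypothetical, the ratio is assumed to be incorrect calls divided by all called pitches, and the zone check is simplified: it uses the 17-inch width of home plate and a per-pitch top and bottom, and it ignores the radius of the ball.

```python
from collections import defaultdict
from dataclasses import dataclass

# Home plate is 17 inches wide; locations below are in feet from its center.
PLATE_HALF_WIDTH_FT = 17 / 12 / 2


@dataclass
class Pitch:
    """One called pitch. All field names are hypothetical, for illustration."""
    umpire: str
    px: float      # horizontal location at the plate, feet from center
    pz: float      # height at the plate, feet
    sz_bot: float  # bottom of this batter's strike zone, feet
    sz_top: float  # top of this batter's strike zone, feet
    call: str      # what the umpire called: "strike" or "ball"


def in_zone(p: Pitch) -> bool:
    """Simplified check: treats the ball as a point, ignoring its radius."""
    return abs(p.px) <= PLATE_HALF_WIDTH_FT and p.sz_bot <= p.pz <= p.sz_top


def bad_call_ratios(pitches):
    """Assumed definition: incorrect calls / all called pitches, per umpire."""
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for p in pitches:
        totals[p.umpire] += 1
        correct_call = "strike" if in_zone(p) else "ball"
        if p.call != correct_call:
            wrong[p.umpire] += 1
    return {ump: wrong[ump] / totals[ump] for ump in totals}


def rank_umpires(pitches):
    """Highest ratio (worst accuracy) first."""
    ratios = bad_call_ratios(pitches)
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data, not real pitches.
sample = [
    Pitch("Umpire A", 0.0, 2.5, 1.5, 3.5, "strike"),  # in zone, correct
    Pitch("Umpire A", 1.2, 2.5, 1.5, 3.5, "strike"),  # off the plate, wrong
    Pitch("Umpire B", 0.0, 2.5, 1.5, 3.5, "strike"),  # in zone, correct
    Pitch("Umpire B", 0.0, 4.0, 1.5, 3.5, "ball"),    # too high, correct
]
ranking = rank_umpires(sample)
```

In this toy sample, Umpire A misses one of two calls and lands at the top of the ranking, while Umpire B gets both right. A real analysis would also need to account for the ball's width and for the tracking system's margin of error near the edges of the zone.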

The findings were troubling.