What Umpires Get Wrong

InsideTheParker

persists in error
SoSH Member
Jul 15, 2005
40,630
Pioneer Valley
Just wanted to draw people's eyes to this report in the NYT on a study of umpires' strike-zone calls.
 
 
In research soon to be published in the journal Management Science, we studied umpires’ strike-zone calls using pitch-location data compiled by the high-speed cameras introduced by Major League Baseball several years ago in an effort to measure, monitor and reward umpires’ accuracy. After analyzing more than 700,000 pitches thrown during the 2008 and 2009 seasons, we found that umpires frequently made errors behind the plate — about 14 percent of non-swinging pitches were called erroneously.

Some of those errors occurred in fairly predictable ways. We found, for example, that umpires tended to favor the home team by expanding the strike zone, calling a strike when the pitch was actually a ball 13.3 percent of the time for home team pitchers versus 12.7 percent of the time for visitors.

Other errors were more surprising. Contrary to the expectation (or hope) that umpires would be more accurate in important situations, we found that they were, in fact, more likely to make mistakes when the game was on the line. For example, our analyses suggest that umpires were 13 percent more likely to miss an actual strike in the bottom of the ninth inning of a tie game than in the top of the first inning, on the first pitch.

We also found that the pitch count had an influence over the umpire’s perception of a pitch. When the count was 3-0, and another ball would end the at-bat, the umpires mistakenly called a strike 18.6 percent of the time, compared with a 14.7 percent error rate when the count was 0-0. But when the count was 0-2, with another strike yielding a strikeout, the umpires expanded the strike zone only 7.3 percent of the time, half the error rate for 0-0. The umpires, in other words, appeared biased against ending an at-bat.
There's some other stuff about racism and favoring All-Star pitchers. I found it a nice study that seems to verify what most of us think we see all the time.
http://www.nytimes.com/2014/03/30/opinion/sunday/what-umpires-get-wrong.html?action=click&contentCollection=Baseball&module=MostEmailed&version=Full&region=Marginalia&src=me&pgtype=article
 

geoduck no quahog

not particularly consistent
Lifetime Member
SoSH Member
Nov 8, 2002
13,024
Seattle, WA
...we studied umpires’ strike-zone calls using pitch-location data compiled by the high-speed cameras introduced by Major League Baseball several years ago in an effort to measure, monitor and reward umpires’ accuracy.
 
 
In other words, "we used the machine that exists in every park which immediately and accurately identifies balls and strikes as a test against the humans who consistently miss identifying balls and strikes in an effort to see how often the subjective humans blow their responsibility when an objective method is tested and available..."
 

Sampo Gida

Member
SoSH Member
Aug 7, 2010
5,044
A related article here focusing on park effects.
 
http://www.fangraphs.com/blogs/ballpark-strike-zone-factors/
 
For whatever reason, 7% more pitches out of the zone get called strikes at YS3 than at other parks. At Fenway, 3% fewer pitches out of the zone are called strikes.
 
I immediately thought Mo, but then I realized he could not have had that much impact, since he pitched only about 40 innings at YS3 in the 2 years of the study, or about 1.5% of the total.
 

SumnerH

Malt Liquor Picker
Dope
SoSH Member
Jul 18, 2005
32,059
Alexandria, VA
geoduck no quahog said:
In other words, "we used the machine that exists in every park which immediately and accurately identifies balls and strikes as a test against the humans who consistently miss identifying balls and strikes in an effort to see how often the subjective humans blow their responsibility when an objective method is tested and available..."
Why do you claim this?  We definitely see commentary from our PitchFX people (jnai et al) about the calibration for PitchF/X being off in a particular park, or the top and bottom of the strikezone are wrong, or about how the strike zone overlay should be ignored in favor of some other box, etc.  
 
 
OttoC said:
I spoke with Mike Port this weekend and he feels that there is still a big problem with determining the bottom of the strike zone with the automatic systems now in use.
 
 
Jnai said:
A few things:
 
The parameters that TBS is using for the top and bottom of the strikezone are probably bad. They may be using sz_top and sz_bot (parameters in the gameday files that are supposedly calibrated for the strikezone of each batter which are known to be crap) or they may be using fixed top and bottom values, but those values are almost certainly incorrect. This is making the strikezone appear much worse, in some cases, than the actual zone called by the umpire.
I believe the article is flawed in assuming that if the cameras and umpire differ on a call then the umpire is always wrong and the cameras are always right. But the cameras are unlikely to be biased by situation and circumstance, so the parts of the article about preferential treatment for star pitchers, situational blown calls, racism, and the like are probably at least correct in direction.
 

Hoplite

New Member
Oct 26, 2013
1,116
I hate when umpires do this.
 
 
 
We also found that the pitch count had an influence over the umpire’s perception of a pitch. When the count was 3-0, and another ball would end the at-bat, the umpires mistakenly called a strike 18.6 percent of the time, compared with a 14.7 percent error rate when the count was 0-0.
 

Rice4HOF

Member
SoSH Member
Jan 21, 2002
1,903
Calgary, Canada
I haven't clicked on the link, but more strikes on 3-0 and fewer on 0-2 is an old observation that was quantified in this study in the Hardball Times. And that one has always irked me. Whenever I call a 3-0 pitch that misses the plate by a few inches a ball and I hear the coach or fans complain "c'mon blue you gotta give him that on 3-0", I want to ask them where the rule book says the zone changes depending on the count.
 
Although in the umpire's defense, widening the strike zone in late innings of a blowout is something that probably both teams appreciate, but it may show up in the stats as missed calls.
 

OBguy

New Member
Jul 14, 2005
1
Hi, I’m one of the authors (Jerry Kim) of the NY Times article. I’m a Sox fan and longtime lurker at SOSH (since 2003), so it was quite exciting to see the article discussed in the forum.
 
I thought those in the thread would be amused to know that the inspiration for the study was in fact Mariano Rivera and Derek Jeter, and how they always seemed to get calls to go in their favor. (And judging from the Game Threads, I knew I wasn’t alone in my frustration.) Once I found out about the PitchFx data—again, mostly from the posts on SOSH—I thought it would be the perfect opportunity to confirm that the Yankees and their stars benefit from umpire generosity.
 
To SumnerH’s point, we are fully aware that the PitchFx data is flawed, and honestly, we weren’t too comfortable with the Times pushing the headline around the raw error rate. (But nuance is probably not a strong suit of modern journalism, even the Times.) In the actual paper, we are much more focused on the situational factors such as All-Star or prior performance of the players, which as SumnerH points out, wouldn’t be correlated with any camera calibration errors, and thus, less problematic.
 
As for the “Mariano Strike Zone”, the raw data suggests that it is quite real. He got around 21% of pitches outside the zone called a strike (compared to the average pitcher getting 13%) in 2008-2009. Now part of that is because he was so goddamn good and threw a ton of pitches right around the border. While our aggregate pitch-level models account for the distance of the pitch from the border, we can’t really extrapolate this to a comparison of one pitcher versus other pitchers. (We would have to find a “counterfactual” Mariano Rivera, who has a similar portfolio of pitch locations, but is not actually him, which is sort of impossible given that he is so unique.)
 
That said, I was playing around with the 2013 data, and was curious to see what the effect (if any) of the “Mariano farewell tour” was. The raw numbers were quite amazing: In 2013, he got a whopping 25% of his pitches outside the zone called a strike. Since it’s not likely that he dramatically increased his ability to locate pitches near the border since 2008-2009, I think it’s safe to say that the umpires were cutting him extra slack. (Makes me a bit worried about the captain’s final season.)
 
Anyhow, just thought it was really cool that you guys were discussing the article, and wanted to chime in.
 
 
 

terrisus

formerly: imgran
SoSH Member
Hey Jerry, thanks for dropping in, and I'm glad this finally pulled you out of a decade of lurking. And thanks for answering the questions that we had and providing some insight into those specific situations. Definitely not surprising at all, but it's very meaningful to get some actual data behind the feeling that this was the case.
 
Even if, as you and Sumner noted, the data being worked with aren't perfect, it's definitely good to see some meaningful analysis being done with it in a paper of record. (And also good to know we have a Red Sox fan hanging out in said paper as well).
 

ToeKneeArmAss

Paul Byrd's pitching coach
Lifetime Member
SoSH Member
Jerry - great work and thanks for joining in.

As for the bit about how Mariano gets more balls called strikes since he's always so close to the plate, I think that's testable.

One would need to carve the edges of the zone into bite-sized pieces (say 1in squares) then examine the frequency with which a pitch in each square on average was called a strike.

You could then compute the number of called strikes a pitcher could expect to get given his pitch scatter. Then compare that to the actual number of called strikes received and we have a pretty good proxy for how umpire-friendly (or unfriendly) a particular pitcher is.
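
A rough sketch of that idea in Python, for anyone who wants to try it. Everything here is an assumption for illustration: the function names are made up, locations are taken to be in inches, and each pitch is a plain (x, z, called_strike) tuple for a taken pitch rather than any real PitchF/X field layout.

```python
from collections import defaultdict

def bin_of(x, z, size=1.0):
    """Map a pitch location (inches) to the square of side `size` containing it."""
    return (int(x // size), int(z // size))

def strike_rate_by_bin(pitches, size=1.0):
    """Average called-strike frequency per square.

    pitches: iterable of (x, z, called_strike) for pitches the batter took.
    """
    counts = defaultdict(lambda: [0, 0])  # square -> [called strikes, total]
    for x, z, called_strike in pitches:
        cell = counts[bin_of(x, z, size)]
        cell[0] += int(called_strike)
        cell[1] += 1
    return {b: strikes / total for b, (strikes, total) in counts.items()}

def strikes_above_expected(pitcher_pitches, rates, size=1.0):
    """Compare a pitcher's actual called strikes with what his scatter predicts.

    Returns (actual, expected, actual - expected). Pitches landing in a square
    with no baseline data are skipped on both sides of the comparison.
    """
    actual = 0
    expected = 0.0
    for x, z, called_strike in pitcher_pitches:
        b = bin_of(x, z, size)
        if b not in rates:
            continue
        actual += int(called_strike)
        expected += rates[b]
    return actual, expected, actual - expected
```

A positive difference would suggest an umpire-friendly pitcher, a negative one the opposite. Pitches right on the border fall in squares where the average rate is already high, so a pitcher who lives there gets no extra credit just for his location. In practice the baseline should probably be built from everyone except the pitcher being tested, and thinly populated squares will be noisy.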
 

Al Zarilla

Member
SoSH Member
Dec 8, 2005
59,496
San Andreas Fault
OBguy said:
Hi, I’m one of the authors (Jerry Kim) of the NY Times article. I’m a Sox fan and longtime lurker at SOSH (since 2003), so it was quite exciting to see the article discussed in the forum.
 
I thought those in the thread would be amused to know that the inspiration for the study was in fact Mariano Rivera and Derek Jeter, and how they always seemed to get calls to go in their favor. (And judging from the Game Threads, I knew I wasn’t alone in my frustration.) Once I found out about the PitchFx data—again, mostly from the posts on SOSH—I thought it would be the perfect opportunity to confirm that the Yankees and their stars benefit from umpire generosity.
 
To SumnerH’s point, we are fully aware that the PitchFx data is flawed, and honestly, we weren’t too comfortable with the Times pushing the headline around the raw error rate. (But nuance is probably not a strong suit of modern journalism, even the Times.) In the actual paper, we are much more focused on the situational factors such as All-Star or prior performance of the players, which as SumnerH points out, wouldn’t be correlated with any camera calibration errors, and thus, less problematic.
 
As for the “Mariano Strike Zone”, the raw data suggests that it is quite real. He got around 21% of pitches outside the zone called a strike (compared to the average pitcher getting 13%) in 2008-2009. Now part of that is because he was so goddamn good and threw a ton of pitches right around the border. While our aggregate pitch-level models account for the distance of the pitch from the border, we can’t really extrapolate this to a comparison of one pitcher versus other pitchers. (We would have to find a “counterfactual” Mariano Rivera, who has a similar portfolio of pitch locations, but is not actually him, which is sort of impossible given that he is so unique.)
 
That said, I was playing around with the 2013 data, and was curious to see what the effect (if any) of the “Mariano farewell tour” was. The raw numbers were quite amazing: In 2013, he got a whopping 25% of his pitches outside the zone called a strike. Since it’s not likely that he dramatically increased his ability to locate pitches near the border since 2008-2009, I think it’s safe to say that the umpires were cutting him extra slack. (Makes me a bit worried about the captain’s final season.)
 
Anyhow, just thought it was really cool that you guys were discussing the article, and wanted to chime in.
 
 
Jerry, there was another great one, revered around here, who I'm sure you know had a lot of borderline pitches go his way: Ted Williams. The reasons you hear are that umps knew about his early-career 20:15, or even 20:13, vision, his ridiculous hitting prowess, and the fact that he almost never argued with umpires. Comes with the territory of greatness I guess. Greg Maddux is another one that got the superstar treatment, it is written. Seeing Williams' treatment made me a bit less resentful of Mo, I suppose. Thanks for chiming in and don't be a stranger!
 

MentalDisabldLst

Guest
Jerry, thanks for dropping by.  You've got your hands on some cool stuff!
 
OBguy said:
As for the “Mariano Strike Zone”, the raw data suggests that it is quite real. He got around 21% of pitches outside the zone called a strike (compared to the average pitcher getting 13%) in 2008-2009. Now part of that is because he was so goddamn good and threw a ton of pitches right around the border. While our aggregate pitch-level models account for the distance of the pitch from the border, we can’t really extrapolate this to a comparison of one pitcher versus other pitchers. (We would have to find a “counterfactual” Mariano Rivera, who has a similar portfolio of pitch locations, but is not actually him, which is sort of impossible given that he is so unique.)
 
I may be missing something, but I have to believe this is a fairly simple barrier to overcome.  At whatever pitch-location granularity you've got, find all the pitches thrown of a similar type (maybe not just cutters but all FB types should do), in the same count.  Compare the error rate there to Mariano's experienced error rate, and you've got a fair comparison.  Aggregate that across all the pitch locations you're dealing with, and you've got a profile that should prove or disprove the thesis.  You don't need these other pitches to be thrown by a pitcher of a similar profile (maybe same-handedness, but that's all) to demonstrate a systematic, predictable error associated with Rivera.  I wouldn't limit it to one season, I think his whole career is sufficient to spot a deviation.  Plus picking one year feels like cherry-picking.
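
To make the matching concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name and the dict keys ('pitcher', 'cell', 'pitch_type', 'count', 'miscalled') are hypothetical, not real PitchF/X fields, and 'cell' stands for whatever location granularity the data supports.

```python
from collections import defaultdict

def matched_comparison(pitches, pitcher):
    """Compare one pitcher's error rate with other pitchers' matched pitches.

    pitches: iterable of dicts with hypothetical keys 'pitcher', 'cell',
    'pitch_type', 'count', and 'miscalled' (True if the call was wrong).
    Returns (pitcher_rate, baseline_rate), or None if nothing could be matched.
    """
    mine = defaultdict(lambda: [0, 0])    # stratum -> [miscalls, pitches]
    others = defaultdict(lambda: [0, 0])
    for p in pitches:
        key = (p["cell"], p["pitch_type"], p["count"])
        bucket = mine if p["pitcher"] == pitcher else others
        bucket[key][0] += int(p["miscalled"])
        bucket[key][1] += 1

    n = 0            # the pitcher's pitches that have a matched stratum
    errors = 0       # miscalls among those pitches
    baseline = 0.0   # miscalls expected at the other pitchers' rates
    for key, (e, k) in mine.items():
        if key not in others:
            continue  # nobody else threw a comparable pitch; skip it
        other_errors, other_total = others[key]
        n += k
        errors += e
        baseline += k * other_errors / other_total
    if n == 0:
        return None
    return errors / n, baseline / n
```

The baseline is weighted by how often the pitcher himself threw to each location, pitch type, and count, so the two rates describe the same mix of pitches. A pitcher rate well above the baseline across a whole career would point to a systematic difference, though this sketch says nothing about whether the gap is statistically significant.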
 
Likewise, I can't be the only Sox fan who'd be interested in a similar Derek Jeter study, particularly on inside pitches, just out of a sheer sense of schadenfreude.
 

ToeKneeArmAss

Paul Byrd's pitching coach
Lifetime Member
SoSH Member
MentalDisabldLst said:
 
Likewise, I can't be the only Sox fan who'd be interested in a similar Derek Jeter study, particularly on inside pitches, just out of a sheer sense of schadenfreude.
You mean like figuring out the value of the Jeterian Ass-Jut?

Wouldn't that sort of be making the intangibles tangible? ;)
 

MentalDisabldLst

Guest
I much prefer tangible asses to intangible ones.  Or ass-juts.  Whatever.
 
This dataset is like taking the red pill from the Matrix in terms of knowing umpires' abilities.  You've felt it your entire life - that there's something wrong with umpiring. You don't know what it is, but it's there, like a splinter in your mind, driving you mad. It is this feeling that has brought this study to us.
 
Bring on our robot-ump overlords.