The true value of defense?


13 replies to this topic

#1 Dummy Hoy


  • Angry Pissbum


  • 2,373 posts

Posted 20 March 2009 - 08:15 PM

John Dewan, author of the Fielding Bible, has come up with what he calls "The most amazing, and significant, discovery of my 25 years in the baseball analysis business."

Dewan is convinced that the value of defense is about 50% of that of offense. This is much higher than most people would imagine (no?), so there's a chance this is pretty controversial. His idea is that there is a steady correlation between the gap separating the best and worst offenses and the gap separating the best and worst defenses (based on runs scored).

Check it out.

I love Dewan, and I think he's as good as defensive metrics get, but I'm curious what some of our more statistically inclined members think.

edit: misunderstood Dewan's methodology initially.

Edited by Dummy Hoy, 21 March 2009 - 06:26 AM.


#2 Bellhorn


  • Lumiere


  • 1,520 posts

Posted 20 March 2009 - 11:33 PM

Interesting link, thanks for posting it. The 50% figure may well be accurate, but to verify it, we need to look at the deviation from average across all teams, not just the outliers at both ends. I know that MGL has lists of team defensive runs in the last couple of editions of THT Annual - if I have time tomorrow, I'll try to see what his numbers indicate regarding the respective categories. Or if someone wants to beat me to it and save me some time, so much the better. ;)

There's also the secondary question of whether this magnitude of defensive value only holds true in a retrospective analysis of season events, or whether it's a repeatable skill that can be forecast from year to year as reliably as offense. In other words, is there a greater proportion of random variance in defensive runs saved? I suspect that there might be, as even the best defensive metrics still require multiple seasons of data before they generate reliable estimates of player ability.

#3 OttoC


  • Mr. Excel


  • 6,444 posts

Posted 21 March 2009 - 08:20 AM

Dewan's claim that defensive runs are worth 50% of offensive runs seems to imply cause-and-effect. I admit to not knowing anything about his defensive metrics, but I do wonder how they can show that this isn't simply coincidental.

The ratio of unearned runs to total runs allowed is reasonably consistent over short spans of years (especially in more modern times). The rate has steadily decreased over the history of baseball. For example, the rate (UER/RA) for 2000-08 was 7.9% (about 61 UER per team per year) while the rate for the 1930s was 14.4% (about 109 UER per team per year). Assuming that all the data necessary to produce Dewan's fielding metrics is available, would they show this? If they need something like batted-ball strength for calculation, then we will need to wait a while before there is sufficient data to draw conclusions.

#4 URI


  • stands for life, liberty and the uturian way of life


  • 8,204 posts

Posted 21 March 2009 - 08:34 AM

I'm guessing by "defense", he means "fielding" and not "run prevention".

And I don't think it's all that groundbreaking.

Consider this: In Win Shares, Bill James' formula basically assumes that, in the grand scheme of things, offense is ~50% of the game, and defense is ~50%. This changes from team to team based on their strengths.

Now, he assumes (using the same generic team idea) that pitching is ~67% of run prevention, and fielding is ~33% (keep in mind that I last looked at this stuff in 03, so if I'm off here, it's my mind playing tricks on me).

Which puts pitching at ~33% of the game (67% of 50%) and fielding at ~17%. Dewan said that fielding is 50% of offense, which puts it at 25% of the game.

Yeah it's more, but not significantly, and still a far cry from the 70% of the game platitudes of old.
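The arithmetic in this post can be checked with a short sketch. The 50/50 and 67/33 splits are the approximate Win Shares figures as recalled above (the poster notes they may be off), and the variable names are purely illustrative.

```python
# Approximate shares as recalled in the post; not verified against Win Shares itself.
offense_share = 0.50
run_prevention_share = 0.50
pitching_within_prevention = 0.67
fielding_within_prevention = 0.33

# Share of the whole game attributed to each half of run prevention.
pitching_share = run_prevention_share * pitching_within_prevention
fielding_share = run_prevention_share * fielding_within_prevention

# Dewan's claim as summarized in this thread: fielding is worth about 50% of offense.
dewan_fielding_share = 0.50 * offense_share

print(f"Win Shares pitching: {pitching_share:.1%}")
print(f"Win Shares fielding: {fielding_share:.1%}")
print(f"Dewan fielding:      {dewan_fielding_share:.1%}")
```

Under these assumed splits, the gap between the two fielding estimates is roughly 8 percentage points of the game.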

#5 Bellhorn


  • Lumiere


  • 1,520 posts

Posted 21 March 2009 - 03:04 PM

Dewan's claim that defensive runs are worth 50% of offensive runs seems to imply cause-and-effect. I admit to not knowing anything about his defensive metrics, but I do wonder how they can show that this isn't simply coincidental.

The point is that when you convert the best fielding metrics to a run value, you end up seeing that, on average, teams are better/worse than average by about half as much as they are by offensive runs. The best offensive team might typically score 200 runs more than an average team, while the best fielding team might save 100 runs (separated from runs saved by their pitching) and so on down the line. It's potentially a surprise, given the short shrift given to defense by sabermetrics until the last few years, that fielding plays such a large role in team success.

I checked MGL's figures in THT 2008 (I don't have a 2009 copy handy) and they are certainly consistent with Dewan's conclusion (though it would take much more than this to confirm it): the mean absolute deviation in fielding runs was 32.2, compared with 55.6 for hitting, 38.3 for pitching, and 6.7 for baserunning. So using his methodology, fielding was actually nearly 60% as important as hitting in the 2007 season.
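For anyone who wants to redo the comparison, here is a minimal sketch that expresses each component's spread as a fraction of the hitting spread. The four numbers are the mean absolute deviations quoted above; the dictionary and function names are made up for illustration.

```python
# Mean absolute deviations (in runs) quoted above from MGL's figures in THT 2008.
mad_runs = {
    "hitting": 55.6,
    "pitching": 38.3,
    "fielding": 32.2,
    "baserunning": 6.7,
}

def relative_to_hitting(mad):
    """Express each component's spread as a fraction of the hitting spread."""
    return {name: value / mad["hitting"] for name, value in mad.items()}

ratios = relative_to_hitting(mad_runs)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of hitting")
```

The fielding ratio comes out just under 58%, which is where the "nearly 60%" figure comes from.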

The ratio of unearned runs to total runs allowed is reasonably consistent over short spans of years (especially in more modern times). The rate has steadily decreased over the history of baseball. For example, the rate (UER/RA) for 2000-08 was 7.9% (about 61 UER per team per year) while the rate for the 1930s was 14.4% (about 109 UER per team per year). Assuming that all the data necessary to produce Dewan's fielding metrics is available, would they show this? If they need something like batted-ball strength for calculation, then we will need to wait a while before there is sufficient data to draw conclusions.

I may be misunderstanding your point, but there's no reason to assume that Dewan's metrics would confirm this, as it's entirely possible that average standards of range and sure-handedness have not evolved in parallel over the course of baseball history. It's also possible that the trend in UER/RA has resulted from a change in official scoring tendencies. Either way, it doesn't seem very important to the overall issue.

#6 StupendousMan

  • 380 posts

Posted 22 March 2009 - 01:43 PM

This discussion, together with the "What makes a good discussion?" thread, prompted me to write a little tutorial which tries to illustrate _how_ one might look for the true value of defense -- or some other facet of a team's play. It doesn't really answer the question being asked here, but I thought I'd mention it since it might help readers of this thread to attack the question themselves.

A simple example of correlation analysis

I look forward to reading more on this topic!

#7 Max Power


  • thai good. you like shirt?


  • 1,955 posts

Posted 22 March 2009 - 04:22 PM

Can we really reliably remove pitching from the equation? All the defense independent pitching stats I've seen assign HR allowed, BB, K, and HBP to the pitcher and everything else to the defense, which seems way oversimplified. If this "discovery" is based on those assumptions, then I don't think it's worth all that much.

I know all the old Voros McCracken stuff about pitchers not being able to control hits on balls in play, but what about quality of contact? A pitcher who gives up more line drives will make things more difficult on his fielders and give up way more doubles and triples than one who only allows weak contact and gives up bloops and bleeders. It may be the same BABIP, but the former will have a higher SLG BIP and give up more runs with the same defense behind him. I suppose you could incorporate line drive percentage into the calculations, but that doesn't tell you if they were hit right at fielders. You could use video to try to figure that out, but then you're subjectively determining whether a fielder should have made a play. That leads right back to the original question of what's attributable to the pitcher versus the defense.

Everyone can agree that the game is 50% offense and 50% pitching and defense, except Joe Morgan and the like. Trying to split that pitching and defense half with any accuracy seems an exercise in futility considering how interlinked they are.

#8 Bellhorn


  • Lumiere


  • 1,520 posts

Posted 22 March 2009 - 07:02 PM

This discussion, together with the "What makes a good discussion?" thread, prompted me to write a little tutorial which tries to illustrate _how_ one might look for the true value of defense -- or some other facet of a team's play. It doesn't really answer the question being asked here, but I thought I'd mention it since it might help readers of this thread to attack the question themselves.

A simple example of correlation analysis

I look forward to reading more on this topic!

Great stuff. One question: why not correlate offensive game events with a team's runs scored (and likewise for defensive events/runs allowed) instead of with winning percentage? It seems pretty implausible that a team's offensive events have any impact on their RA totals, and by correlating with wins, we would seem to leave open the possibility that, by some coincidence, teams with higher than average (say) GIDP have been worse (or better) than average RA teams in those years. So this would distort the impact that the event actually has on the winning process.

Or are you confident that the sample of teams is large enough that this is not an issue?

#9 Bellhorn


  • Lumiere


  • 1,520 posts

Posted 22 March 2009 - 07:26 PM

Can we really reliably remove pitching from the equation? All the defense independent pitching stats I've seen assign HR allowed, BB, K, and HBP to the pitcher and everything else to the defense, which seems way oversimplified. If this "discovery" is based on those assumptions, then I don't think it's worth all that much.

Fielding statistics have moved beyond this point - you should read a primer on UZR or John Dewan's work in The Fielding Bible (or ask a SoSHer more knowledgeable than myself to give you a tutorial). Basically, by looking at batted ball type (LD, GB, FB) and breaking the field down into zones used to track the location of batted balls, we can get a pretty good idea of how many plays fielders are making above or below average, and how many runs their efforts add to or take away from the team's RA total.

You're certainly correct, however, that until we reach a Hit F/X stage of stats (e.g. the ball traveled off the bat at angle x from the foul line, angle y from the horizontal, at velocity v) we may be crediting or penalizing certain fielders unfairly. There's also the issue of park effects, with Fenway's left field wall being the most notorious example. Ultimately, it wouldn't surprise me if as a result of the next wave of fielding stats, some of the runs currently credited to fielders end up being given back to other areas, either as credit to pitchers or as debit from the opposing team's hitters.

But the consensus right now seems to be that even if the best fielding statistics aren't perfect, they're pretty close, such that future deviations from the estimates they provide will probably be minor.

Everyone can agree that the game is 50% offense and 50% pitching and defense.

Actually, they don't need to be. As a wild example, let's say that advances in strength training mean that 20 years from now, every single position player hits like (what we now consider to be) a superstar, and that every team would score 1000 runs against an average pitching staff from 2009. Let's say that standards of pitching and defense also improve, but not in such a uniform manner, with the result that our future teams are normally distributed by RA tendencies - let's say that the average team allows 850 runs, while the best and worst allow 700 and 1000 respectively. Then you could, in a sense, say that the game is 100% pitching and defense, 0% offense, because a team's success or failure is entirely determined by how well they pitch and field.

Also, even if the talent levels are such that deviation in RS and RA comes equally from both sides of the ball, we can see from the pythagorean formula that a run prevented is slightly more valuable than a run scored (and pitching is more susceptible to being leveraged through bullpen use.) So even in that case, pitching/defense are slightly more than 50% of the game.
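The pythagorean point can be illustrated with a quick sketch. It assumes the classic form of the formula with an exponent of 2, and a hypothetical team that scores and allows 750 runs; the function name and the 10-run change are chosen only for illustration.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical .500 team: 750 runs scored, 750 runs allowed.
baseline = pythag_win_pct(750, 750)

# Add 10 runs of offense versus remove 10 runs allowed.
gain_from_scoring = pythag_win_pct(760, 750) - baseline
gain_from_preventing = pythag_win_pct(750, 740) - baseline

print(f"10 more runs scored:   +{gain_from_scoring:.5f} win pct")
print(f"10 fewer runs allowed: +{gain_from_preventing:.5f} win pct")
```

With these assumptions the prevented runs add slightly more expected winning percentage than the same number of extra runs scored, though the difference is small.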

#10 Max Power


  • thai good. you like shirt?


  • 1,955 posts

Posted 22 March 2009 - 08:34 PM

Fielding statistics have moved beyond this point - you should read a primer on UZR or John Dewan's work in The Fielding Bible (or ask a SoSHer more knowledgeable than myself to give you a tutorial). Basically, by looking at batted ball type (LD, GB, FB) and breaking the field down into zones used to track the location of batted balls, we can get a pretty good idea of how many plays fielders are making above or below average, and how many runs their efforts add to or take away from the team's RA total.


I know that they've tried moving toward breaking down fielding stats into zones and batted ball types, but are those used in fielding independent pitching stats? I've looked it up and dERA, FIP, DICE, etc. still don't take all that UZR related data into account. Are my web searching skills failing me on that one? If this study added up UZR numbers rather than using pitching stats, I guess it could work, but I still think there's way too much subjectivity in the numbers. The batted ball types are still determined by guys sitting at the games and pushing a button for fly, line, or ground and soft, medium, or hard.

But the consensus right now seems to be that even if the best fielding statistics aren't perfect, they're pretty close, such that future deviations from the estimates they provide will probably be minor.

It feels like UZR numbers are way too volatile to be a minor deviation from truth. Orlando Hudson has ranged from -9 to +16 UZR/150, A-Rod is -12 to +12, Jeter is +1 to -18, Jeff Kent is +8 to -22. That either means there's so much noise that it makes the yearly numbers much less meaningful, or the numbers are precise and defense is a highly variable skill. If it's the former, we still have a long way to go for real fielding stats. If it's the latter, it doesn't matter how good the stats are, it won't help GMs put together teams.

Actually, they don't need to be. As a wild example, let's say that advances in strength training mean that 20 years from now, every single position player hits like (what we now consider to be) a superstar, and that every team would score 1000 runs against an average pitching staff from 2009. Let's say that standards of pitching and defense also improve, but not in such a uniform manner, with the result that our future teams are normally distributed by RA tendencies - let's say that the average team allows 850 runs, while the best and worst allow 700 and 1000 respectively. Then you could, in a sense, say that the game is 100% pitching and defense, 0% offense, because a team's success or failure is entirely determined by how well they pitch and field.

Also, even if the talent levels are such that deviation in RS and RA comes equally from both sides of the ball, we can see from the pythagorean formula that a run prevented is slightly more valuable than a run scored (and pitching is more susceptible to being leveraged through bullpen use.) So even in that case, pitching/defense are slightly more than 50% of the game.


You're right on the second part about the pythagorean formula, dumb mistake on my part. But the first part depends on every team having identical offensive ability, not on their ability to score lots of runs. You'd get the exact opposite breakdown if you assumed every team was exactly as good at preventing runs as the others, but they had variable offensive ability. Then the game would be 0% pitching and defense and 100% offense, regardless of what you set the baseline to.

#11 Bellhorn


  • Lumiere


  • 1,520 posts

Posted 22 March 2009 - 10:16 PM

I know that they've tried moving toward breaking down fielding stats into zones and batted ball types, but are those used in fielding independent pitching stats? I've looked it up and dERA, FIP, DICE, etc. still don't take all that UZR related data into account. Are my web searching skills failing me on that one? If this study added up UZR numbers rather than using pitching stats, I guess it could work, but I still think there's way too much subjectivity in the numbers. The batted ball types are still determined by guys sitting at the games and pushing a button for fly, line, or ground and soft, medium, or hard.

The bolded section is correct. They're added up independently of each other, nothing to do with DIPS type processes. As for the limitations of UZR data....well, yeah. They're not as perfect as offensive numbers, but again, most people seem to think they do a pretty good job giving a decent picture of reality. We may have to agree to disagree if you wish to maintain otherwise.

It feels like UZR numbers are way too volatile to be a minor deviation from truth. Orlando Hudson has ranged from -9 to +16 UZR/150, A-Rod is -12 to +12, Jeter is +1 to -18, Jeff Kent is +8 to -22. That either means there's so much noise that it makes the yearly numbers much less meaningful, or the numbers are precise and defense is a highly variable skill. If it's the former, we still have a long way to go for real fielding stats. If it's the latter, it doesn't matter how good the stats are, it won't help GMs put together teams.

Well, we need more than a few select cases to get a firm sense of the autocorrelation of the stat. Does anyone out there know what it is? If it is indeed significantly lower than those for the most stable stats, then sure, that would be a good reason to consider that the jury is still out when it comes to the original thesis of this thread. Though while the superior fielding stats might end up reapportioning some runs to pitching/hitting/baserunning, they might also confirm what UZR suggests, or suggest that defensive performance really is highly variable from year to year (though I can't see much reason why this should be the case.)

You'd get the exact opposite breakdown if you assumed every team was exactly as good at preventing runs as the others, but they had variable offensive ability. Then the game would be 0% pitching and defense and 100% offense, regardless of what you set the baseline to.

Yes, of course. The example was just designed to show that it doesn't have to be 50/50.

#12 Max Power


  • thai good. you like shirt?


  • 1,955 posts

Posted 23 March 2009 - 12:09 PM

Yes, of course. The example was just designed to show that it doesn't have to be 50/50.


This is kind of a theoretical statistical tangent, but how would you identify if teams were all equal in either offense or defense? A run scored by someone is a run given up by someone else, so you'd see a pretty much normal distribution of runs scored and runs allowed even if every team were equally good at scoring runs.

Edit: I'm an idiot. If every team were equally good at allowing runs, their runs allowed would be league average minus their runs scored over average. And that would be the exact same distribution if every team were equally good at scoring runs, so telling the two situations apart would be difficult.

Edited by Max Power, 23 March 2009 - 12:28 PM.


#13 Bellhorn


  • Lumiere


  • 1,520 posts

Posted 23 March 2009 - 11:29 PM

This is kind of a theoretical statistical tangent, but how would you identify if teams were all equal in either offense or defense? A run scored by someone is a run given up by someone else, so you'd see a pretty much normal distribution of runs scored and runs allowed even if every team were equally good at scoring runs.

Edit: I'm an idiot. If every team were equally good at allowing runs, their runs allowed would be league average minus their runs scored over average. And that would be the exact same distribution if every team were equally good at scoring runs, so telling the two situations apart would be difficult.

You're certainly not an idiot, but I want to follow up on this: I'm assuming in the discussion that there's an opponent-adjustment component to run totals before determining the percentage of each part of the game; e.g. BP's Adjusted EQR instead of just EQR. But to be fair, this wasn't necessarily involved in the figures from the THT annual.

However, even without a rigorous opponent adjustment, you should get different distributions of run totals depending on whether the greater difference in ability lies on the offensive or defensive side of the ball. Consider a simplified scenario where there are 10 teams, and each team plays each other team once. Let's pretend that we somehow know that all teams have equal run scoring ability (let's call it 5 RSPG) but the run prevention abilities are divided into one 3 RAPG team, eight 5 RAPG teams, and one 7 RAPG team. What happens when they play each other? To simplify things, let's assume that the game outcome depends equally on the tendencies on either side of the ball. The 3 RAPG team will allow 4 runs in every game, the 5 RAPG teams will allow 5 runs in every game, and the 7 RAPG team will allow 6 runs in every game. On the other side, the 3 RAPG team will score 5 runs in eight of their games and 6 in the other one (5.11 runs per game), the 5 RAPG teams will score 4 runs once, 5 runs seven times, and 6 runs once (5 runs per game), and the 7 RAPG team will score 5 runs eight times and 4 runs once (4.89 runs per game).

If you reverse the assumptions so that the outliers are on the offensive side while every team prevents runs equally well, the distribution of runs scored/allowed per game will be reversed. Even though the distribution does not exactly match the Platonic run tendencies of the teams, unless I'm missing something, every distribution of outcomes should have a one-to-one correspondence with a particular distribution of run tendencies.
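The 10-team scenario above can be reproduced with a small sketch. It assumes, as in the post, that the runs in each half of a game are the simple average of the batting team's scoring tendency and the fielding team's run-allowing tendency; the function and variable names are invented for illustration.

```python
from itertools import permutations

def season_averages(offense, defense):
    """Single round-robin: every ordered pair (batting team, fielding team)
    occurs once. Runs are the mean of the two teams' tendencies (assumption)."""
    n = len(offense)
    scored = [0.0] * n
    allowed = [0.0] * n
    for bat, field in permutations(range(n), 2):
        runs = (offense[bat] + defense[field]) / 2
        scored[bat] += runs
        allowed[field] += runs
    games = n - 1  # each team plays every other team once
    return [s / games for s in scored], [a / games for a in allowed]

# Equal offenses (5 RSPG); defenses of 3, eight at 5, and 7 RAPG.
equal = [5] * 10
varied = [3] + [5] * 8 + [7]

rs, ra = season_averages(offense=equal, defense=varied)
print("RS per game:", [round(x, 2) for x in rs])
print("RA per game:", [round(x, 2) for x in ra])

# Reverse the assumptions: varied offenses, equal defenses.
rs_rev, ra_rev = season_averages(offense=varied, defense=equal)
print("Reversed RS per game:", [round(x, 2) for x in rs_rev])
print("Reversed RA per game:", [round(x, 2) for x in ra_rev])
```

In this toy model the reversed case simply swaps the runs-scored and runs-allowed columns, so the two situations leave different fingerprints: the side with the real talent spread shows the wide distribution, and the other side shows only the small schedule-driven wobble.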

#14 Vermonter At Large


  • SoxFan


  • 3,055 posts

Posted 29 March 2009 - 09:03 PM

Honestly, what I think Dewan "discovered" is the contention I've been making all along that all defensive metrics overestimate runs allowed by about 50%, because there is no attribution to the pitcher.

What his discovery should have been, therefore, is that his metric double-counts fielding runs.



