Chat with John Dewan

QUOTE (Jnai @ Jul 13 2009, 01:09 PM)
Trying to get in before the deadline, so two questions:

1) Does BIS* compute things like inter-rater reliability for all of its personnel? Are unreliable people discarded or weighted appropriately? I wonder about the effectiveness of using human judgments for many of these measures.
2) Should defensive stats report both an average value and a measure of variability? For example, players with fewer plays would have larger variability scores, and players with more plays would have smaller ones. The point is: could defensive evals do a better job of communicating the amount of variability in each player's (or team's) measurement, so that fans could get a good sense of whether a difference is really a significant difference?

*Question originally said STATS, not BIS (apologies)

I can only assume STATS is using the same process they used when I left. At BIS, we rigorously train our scorers and minimize any potential biases. During the season, we review each scorer's performance and make corrections as necessary. At the end of each season, we do a review of many plays to ensure that our data is recorded as accurately as possible.

The key with any statistic or number is the context. You can do your best to inform your readers of the process and thought behind each evaluation, and that's all you can do. We could attach a reliability score to every number we publish, but what about when people start misinterpreting the reliability indicator? Then we just have another statistic to explain. There's a fine line between educating readers and being too technical and losing their attention altogether.
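For readers wondering what the inter-rater reliability check raised in the question could look like, here is a minimal sketch of Cohen's kappa, a standard agreement measure for two raters. It is not a description of the BIS process, which the answer above does not spell out; the function name, the play labels, and the sample data are all made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same plays.

    Returns 1.0 for perfect agreement and roughly 0.0 when agreement
    is no better than chance.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Share of plays on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels, invented for illustration (not BIS data).
scorer_1 = ["good", "good", "miss", "routine", "routine", "miss", "good", "routine"]
scorer_2 = ["good", "routine", "miss", "routine", "routine", "miss", "good", "good"]
kappa = cohens_kappa(scorer_1, scorer_2)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 would suggest two scorers are seeing the same plays the same way; a low value would be a reason to review or retrain, which is the kind of correction the answer describes in general terms.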

QUOTE (maufman @ Jul 13 2009, 01:15 PM)
What kinds of factors skew statistical analyses of defense?

I'll give a concrete example: Jacoby Ellsbury's defense has fallen off a cliff this year, according to most advanced metrics. I find it hard to believe that he was an elite defender in 2008 and is now a poor one. Can you make an educated guess whether one year's rating is more likely to be an aberration than the other? If so, what factors would you look for as signs that a particular player's rating is an aberration (or, inversely, is especially likely to be accurate)?

The biggest thing that skews statistical analysis is sample size.

I mentioned Ellsbury earlier. With barely a full season of innings in the outfield, it's still early to draw strong conclusions. Based on what we've seen so far, Ellsbury has a below-average arm with above-average range, balancing out to an average centerfielder defensively, maybe above average at the corner positions. In 2008, he stood out for having very few Defensive Misplays relative to the Good Plays that we scouted and counted.
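To put a rough number on the sample-size point, here is a small sketch using the textbook binomial standard error for a success rate. It assumes every chance is an independent make-or-miss event, which real fielding chances are not, and it uses a normal approximation; the function name and the counts are hypothetical, not actual fielding data.

```python
import math


def rate_with_interval(plays_made, chances, z=1.96):
    """Return (rate, standard_error, low, high) for a simple success rate.

    Uses the normal approximation to the binomial, so it is only a
    rough guide, especially for small samples or rates near 0 or 1.
    """
    if chances <= 0 or not 0 <= plays_made <= chances:
        raise ValueError("need 0 <= plays_made <= chances and chances > 0")
    rate = plays_made / chances
    se = math.sqrt(rate * (1 - rate) / chances)
    return rate, se, rate - z * se, rate + z * se


# Hypothetical counts: the same 80% rate over 100 and over 400 chances.
small = rate_with_interval(80, 100)
large = rate_with_interval(320, 400)
for label, (rate, se, low, high) in (("100 chances", small), ("400 chances", large)):
    print(f"{label}: rate={rate:.3f}  se={se:.3f}  approx. 95% range=({low:.3f}, {high:.3f})")
```

Quadrupling the number of chances only halves the standard error, which is one way to see why a single season in the outfield can still leave a wide range of plausible true talent levels.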

QUOTE (Bellhorn @ Jul 13 2009, 01:22 PM)
What kind of year-to-year correlation do we currently see in the best defensive metrics? And do you think it's fair to assume that if/when we have a way to perfectly measure defensive performance, it will show a similar year-to-year correlation as the most stable hitting/pitching metrics?

On the team level, we're seeing year-to-year correlations in Runs Saved in the .3 to .4 range. We're not quite at the level of hitting/pitching metrics yet, but we're getting closer. There is no perfect way to measure defense, and the same goes for offense and pitching. We will keep improving our methods, and eventually our understanding of defense will catch up to our understanding of hitting and pitching.
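For anyone who wants to compute a year-to-year correlation themselves, here is a minimal sketch of the Pearson correlation coefficient. The answer does not say which correlation measure was used, so Pearson is an assumption here, and the team values below are invented for illustration rather than real Runs Saved totals.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical team totals for the same five teams in two consecutive
# seasons (made-up numbers, not real Runs Saved data).
year_1 = [10, -5, 20, 0, -15]
year_2 = [4, 6, 9, -8, -11]
r = pearson_r(year_1, year_2)
print(f"year-to-year r = {r:.2f}")
```

Each position in the two lists has to refer to the same team, and with only a handful of teams the estimate of r is itself noisy, so in practice you would want many team-seasons before reading much into the result.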

This is the last question -- thank you to everyone for your very detailed questions. I am very happy to see the level of sophistication and understanding that you all have. Enjoy the rest of the season!

Thank you so much for the incredible responses, John. This was so much more than we had hoped for.

I would like to open the discussion again now. John's busy schedule may preclude him from responding to any follow-up questions, but feel free to ask and discuss and thank the guy for all the time and effort he put into these answers.

Yes, great job by John!