Lists of [Average - Excellent] ranges for advanced stats?

Bleedred

Member
SoSH Member
Feb 21, 2001
10,022
Boston, MA
I am not a sabermetrician, but a big reason why I love being a member of this board is reading sophisticated analyses of Player X or Y.  I know there are sites which explain what BABIP, xFIP, VORP, WAR, ERA+, etc. are, but would it be easy for one or some of you to make a list of average - excellent ranges for all the relevant statistics?
 
For example, if someone who has never seen a baseball game in his life asked me what an [average - stellar] batting average is, I'd say something like this:
 
Average:  250-260
Decent:  260-275
Good:  275-290
Very good:  290-305
Excellent:  305-320 
Stellar:  320+  
 
It would be great to be able to refer to a list like this for advanced stats.  For example, rather than researching what xFIP is (I have no idea at the moment), I could just look at an xFIP list like the one above, which would be really useful.
 

Jnai

is not worried about sex with goats
SoSH Member
Sep 15, 2007
16,144
xFIP is like FIP, except it uses league average HR/FB rate.
 
Fangraphs has pages like the ones you are asking for:
http://www.fangraphs.com/library/pitching/xfip/
 
If you want a simple way* to do this, you can use some simple Excel and stats:
1. Take your list of numbers.
2. Find the average. (AVERAGE).
3. Find the standard deviation (STDEV).
 
 
Great (AVERAGE + STDEV*2)
Good (AVERAGE + STDEV)
Average (AVERAGE)
Poor (AVERAGE - STDEV)
Abysmal (AVERAGE - STDEV*2)
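 
If Excel isn't handy, the same recipe can be sketched in a few lines of Python. This is only a rough illustration: the `batting_averages` list is made-up sample data, and `tier_cutoffs` is just a name picked for this example. `statistics.stdev` is the sample standard deviation, which is what Excel's STDEV computes.

```python
from statistics import mean, stdev

def tier_cutoffs(values):
    """Return the five reference points: the mean, and 1 and 2 sample SDs either side."""
    avg = mean(values)
    sd = stdev(values)  # sample SD (divides by n - 1), like Excel's STDEV
    return {
        "Great": avg + 2 * sd,
        "Good": avg + sd,
        "Average": avg,
        "Poor": avg - sd,
        "Abysmal": avg - 2 * sd,
    }

# Hypothetical batting averages, NOT real 2014 data.
batting_averages = [0.240, 0.255, 0.262, 0.270, 0.275, 0.281, 0.290, 0.305, 0.320]

for label, value in tier_cutoffs(batting_averages).items():
    print(f"{label:8s} {value:.3f}")
```

Feed it any column of numbers pulled from a leaderboard and it prints a five-row chart in the same layout as the ones in this post.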
 
For 2014 BA (qualifiers), it looks like:
Great 0.328
Good 0.300
Average 0.272
Poor 0.244
Abysmal 0.216
 
 
For 2014 BsR, for example, it looks like:
Great 7.84
Good 4.08
Average 0.32
Poor -3.43
Abysmal -7.19
 
 
Etc.
 
 
*Yes, I know, it's just a simple way.
 

Merkle's Boner

Member
SoSH Member
Apr 24, 2011
3,826
I think this is a great idea.  Although I'm not sure Jnai's response cleared much up for me. What's BsR?
 

Jnai

is not worried about sex with goats
SoSH Member
Sep 15, 2007
16,144
Oh. I just took some random stat from the fangraphs leaderboard. BsR is their baserunning stat.
 
For qualified OBP this year, it's:
Great 0.406
Good 0.372
Average 0.338
Poor 0.304
Abysmal 0.270
 

seantoo

toots his own horn award winner
Jul 16, 2005
1,308
Southern NH, from Watertown, MA
Jnai said:
Oh. I just took some random stat from the fangraphs leaderboard. BsR is their baserunning stat.
 
For qualified OBP this year, it's:
Great 0.406
Good 0.372
Average 0.338
Poor 0.304
Abysmal 0.270
Have to quibble with this. Anything above average enters the good range, and a .372 OBP is very, very good; in fact only 23 players are at or better than that level, which would make that level great. There are about 750 players, which would put anyone at .372 OBP in the top 3%. Only 3 players are above .406 OBP, which would be the top 0.4 percent. Your standard for measurement is off. Also, the average OBP this season is all the way down to .316.
 
Most seasons the average OBP has been in the .330-.333 range, and like any distribution of stats, most offensive categories follow a bell-shaped curve; therefore equally spacing the 'levels' of performance is not the way to go about this. Start over.
 

StupendousMan

Member
SoSH Member
Jul 20, 2005
1,925
Okay, fine, wise-guy.  How about quintiles?   Sort the values numerically, then divide them into 5 equal groups: the top 20%, the next 20%, etc.  In the case of Batting Average, in 2014, for qualified batters, this gives us the following five groups to go into Jnai's categories (which strike me as a good set to use).
Code:
Great (top 20%):         0.292 to 0.340
Good (next 20%):         0.277 to 0.291
Average (middle 20%):    0.264 to 0.276
Poor (next 20%):         0.241 to 0.263
Abysmal (bottom 20%):    0.199 to 0.240
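 
For anyone who wants to reproduce the quintile split on another stat, here is a minimal sketch using Python's standard library. The numbers in `batting_averages` are invented for illustration (they are not the 2014 qualifiers), and `quintile_cutoffs` is a hypothetical helper name.

```python
from statistics import quantiles

def quintile_cutoffs(values):
    """Return the four cut points that split the data into five equal-sized groups."""
    # method="inclusive" treats the list as the whole population
    # (e.g. every qualified batter), not a sample from a larger one.
    return quantiles(values, n=5, method="inclusive")

# Hypothetical batting averages, NOT real 2014 data.
batting_averages = [0.210, 0.232, 0.241, 0.250, 0.258, 0.264, 0.270,
                    0.276, 0.281, 0.288, 0.292, 0.301, 0.315, 0.327, 0.340]

p20, p40, p60, p80 = quintile_cutoffs(batting_averages)
print(f"Great:   above {p80:.3f}")
print(f"Good:    {p60:.3f} to {p80:.3f}")
print(f"Average: {p40:.3f} to {p60:.3f}")
print(f"Poor:    {p20:.3f} to {p40:.3f}")
print(f"Abysmal: below {p20:.3f}")
```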
 

ALiveH

Member
SoSH Member
Apr 23, 2010
1,104
I guess I just learned that only 1 Red Sox player (Brock Holt) qualifies as Good or Great on batting average, which is depressing.  Everyone else is Average, Poor or Abysmal.
 

Bleedred

Member
SoSH Member
Feb 21, 2001
10,022
Boston, MA
Jnai, I appreciate the effort, but this DOES NOT compute to my simple mind.   The follow-on posts are more useful.  
 

Jnai

is not worried about sex with goats
SoSH Member
Sep 15, 2007
16,144
seantoo said:
Have to quibble with this. Anything above average enters the good range, and a .372 OBP is very, very good; in fact only 23 players are at or better than that level, which would make that level great. There are about 750 players, which would put anyone at .372 OBP in the top 3%. Only 3 players are above .406 OBP, which would be the top 0.4 percent. Your standard for measurement is off. Also, the average OBP this season is all the way down to .316.
 
Most seasons the average OBP has been in the .330-.333 range, and like any distribution of stats, most offensive categories follow a bell-shaped curve; therefore equally spacing the 'levels' of performance is not the way to go about this. Start over.
 
Start over with a bell curve? You do realize that I posted the formula to a fucking bell curve, right?
 
Also, the post pretty clearly said that it used the qualifiers. I was just trying to demonstrate how, with some simple math, he could get to where he wanted to be for whatever stat he was interested in.
 

Jnai

is not worried about sex with goats
SoSH Member
Sep 15, 2007
16,144
Bleedred said:
I appreciate the effort, but this DOES NOT compute to my simple mind.   The follow-on posts are more useful.  
 
Does not compute because you can't understand "average" and "standard deviation", or does not compute because you don't know how to compute them, or does not compute because you don't believe in normal distributions?
 

StupendousMan

Member
SoSH Member
Jul 20, 2005
1,925
For those who are not fluent in the mathematics of statistics, let me add a bit to the rather terse exchange between Jnai and seantoo. 
 
There are many, many instances in which measurements of some quantity -- the mass of human infants at birth, the size of sand particles on a beach, the batting average of baseball players -- have a distinctive distribution: a bell-shaped curve, with a peak and symmetric dropoffs on each side.   There are good reasons why distributions of this sort occur so frequently, but I'd prefer not to get into that now.  In any case, one can describe these distributions using a rather simple mathematical model which can be written as

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - x_0)^2}{2\sigma^2}\right)$$

In this equation, the center of the distribution is the value "x0", and the width of the bell-shaped curve can be described by the value of "sigma".   It turns out that if the distribution is perfectly bell-shaped (in mathematical terms, is "normal"), then
 
  a) half the values will lie above the center x0, and half below
  b) about 1/6 of the values (roughly 16 percent) will be greater than (x0 + sigma).  This is often called "one standard deviation above the mean."
  c) about 1/6 of the values (roughly 16 percent) will be less than (x0 - sigma).  This is often called "one standard deviation below the mean."
  d) only about 2 percent of the values will be larger than (x0 + 2*sigma), or "two standard deviations above the mean."
 
So, Jnai's suggestion was that we use this mathematical model and its parameters to divide baseball players.  Under his suggested method, only about 2 percent of the players would fall into the "Great" category, and about 14 percent would fall into the "Good" category.   My suggestion, to use quintiles, would make it much easier to be called "Great", since 20 percent of all players would fall into that category.
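 
For anyone who wants to check the fractions for a perfectly normal curve, Python's standard library can compute them. This is only a sketch of the idealized model, not of real batting data; the exact one-sided shares come out near 16 percent beyond one sigma and a bit over 2 percent beyond two sigma.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: center 0, sigma 1

above_1 = 1 - z.cdf(1)               # share more than 1 sigma above the center
above_2 = 1 - z.cdf(2)               # share more than 2 sigma above the center
between_1_and_2 = above_1 - above_2  # share between 1 and 2 sigma above

print(f"above x0 + sigma:            {above_1:.1%}")
print(f"above x0 + 2*sigma:          {above_2:.1%}")
print(f"between +1 and +2 sigma:     {between_1_and_2:.1%}")
```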
 
There are lots and lots and LOTS of statistical tools which are designed to work with gaussian ("bell-shaped") distributions, so many stat-lovers will follow Jnai's idea.  It's very quick and easy to extract a big list of player statistics and extract the parameters of the best-fitting gaussian distribution.
 

Frisbetarian

♫ ♫ ♫ ♫ ♫ ♫
Moderator
SoSH Member
Dec 3, 2003
5,273
Off the beaten track
Maybe we need a SoSH math primer! Standard deviation is explained simply here. If you had 5 hitters who hit 15, 20, 21, 24, and 30 home runs, the average number of home runs would be 22. The standard deviation would be sqrt( ((15-22)^2 + (20-22)^2 + (21-22)^2 + (24-22)^2 + (30-22)^2) / 4 ) = 5.5 (dividing by n-1 = 4, which is what Excel's STDEV does). In this very small sample, we have one batter more than 1 SD below average (15 HR) and one batter more than 1 SD above (30 HR). In the terms of Jnai's very good (and pretty simple) post, these are poor and good respectively; the rest are average. In a larger sample, there are larger variations from the mean and players can be over 2 SD (2*5.5 in the example I gave) above or below average. Get it?
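 
The five-hitter example can be checked with a short Python sketch. It shows both flavors of standard deviation, since the answer depends on whether you divide by n or by n-1; the variable names are just for illustration.

```python
from statistics import mean, stdev, pstdev

home_runs = [15, 20, 21, 24, 30]

avg = mean(home_runs)
sample_sd = stdev(home_runs)       # divides by n - 1, like Excel's STDEV
population_sd = pstdev(home_runs)  # divides by n, like Excel's STDEVP

print(avg, round(sample_sd, 2), round(population_sd, 2))

# Who is more than one sample SD away from the average?
far_below = [hr for hr in home_runs if hr < avg - sample_sd]
far_above = [hr for hr in home_runs if hr > avg + sample_sd]
print(far_below, far_above)
```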
 
Using Jnai's/Fangraphs formula makes perfect sense. 
 

Jnai

is not worried about sex with goats
SoSH Member
Sep 15, 2007
16,144
Quintiles are a fine suggestion too. The problem with quintiles is that they assume there are as many "great" players as there are "average" players, which seems intuitively wrong to me, and I think also to most scouts. You hear that scouts very rarely give a 70 and almost never give an 80 (on the 20-80 scale), which fits with the 50+10s model of talent distribution.
 
The problem with the values I gave is probably that they should have included the values around each point, so -.5s to .5s would have been average, .5s to 1.5s would have been good, etc.
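 
A minimal sketch of that banded version, for illustration. The function name `band` is made up, and treating everything beyond 1.5s as Great or Abysmal is an assumption about where the "etc." leads. The example call uses the 2014 BA chart from earlier in the thread (average 0.272, with Good minus Average, 0.028, as the SD).

```python
def band(value, avg, sd):
    """Label a value by how many SDs it sits from the average."""
    z = (value - avg) / sd
    if z >= 1.5:
        return "Great"      # assumed open-ended at the top
    if z >= 0.5:
        return "Good"       # .5s to 1.5s
    if z > -0.5:
        return "Average"    # -.5s to .5s
    if z > -1.5:
        return "Poor"       # -1.5s to -.5s
    return "Abysmal"        # assumed open-ended at the bottom

for ba in (0.330, 0.300, 0.272, 0.244, 0.216):
    print(f"{ba:.3f} -> {band(ba, avg=0.272, sd=0.028)}")
```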
 

Frisbetarian

♫ ♫ ♫ ♫ ♫ ♫
Moderator
SoSH Member
Dec 3, 2003
5,273
Off the beaten track
Jnai said:
 
Quintiles are a fine suggestion too. The problem with quintiles is that they assume there are as many "great" players as there are "average" players, which seems intuitively wrong to me, and I think also to most scouts. You hear that scouts very rarely give a 70 and almost never give an 80 (on the 20-80 scale), which fits with the 50+10s model of talent distribution.
 
The problem with the values I gave is probably that they should have included the values around each point, so -.5s to .5s would have been average, .5s to 1.5s would have been good, etc.
 
I agree with the bolded, and it may make seantoo feel better!
 

EricFeczko

Member
SoSH Member
Apr 26, 2014
4,851
StupendousMan, not to throw a monkey in your wrench, but the mathematical model you posited does not technically fit most of the data you describe.
Don't get me wrong, many things that occur can be fit with a Gaussian model quite nicely. However, in the model you posted, the values extend infinitely in both directions from the peak of the bell-curve. In other words, a "normal" distribution occurs when the values along the distribution can be any real number. However, we already know that many measures are finite. For example, one cannot have a negative batting average or a batting average greater than 1.0000.
One can put the measure on a standard scale by transforming it. In the case of batting average in 2014, you can determine the mean and standard deviation for batting average in MLB. For each batter, you subtract the mean from the hitter's batting average and divide the result by the standard deviation. The new values (z-scores) have mean 0 and standard deviation 1, the scale on which the parametric stats you are describing are usually expressed. Note that this rescales the distribution without changing its shape, so the result is only approximately normal.

However, there's a much simpler way to do all of this. If what one cares about is ranking stats in a given year, then there's no need to use a model, because we are measuring the entire population. Simply declare what cutoffs you want (e.g. great = top 5 percent, abysmal = bottom 5 percent), and find the thresholds for batting average that reflect those cutoffs.
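
A rough sketch of that cutoff approach in Python. The `league` list is invented data for a 20-player "league", and `cutoff` is a hypothetical helper; with real data you would pass in the full list of players for the season.

```python
def cutoff(values, fraction, top=True):
    """Value at the edge of the top (or bottom) `fraction` of the full list."""
    ranked = sorted(values, reverse=top)
    # How many players belong to the group; always at least one.
    count = max(1, round(len(ranked) * fraction))
    return ranked[count - 1]

# Hypothetical batting averages, NOT real data.
league = [0.199, 0.221, 0.236, 0.241, 0.248, 0.255, 0.259, 0.264, 0.268, 0.272,
          0.276, 0.279, 0.284, 0.288, 0.291, 0.297, 0.305, 0.314, 0.327, 0.340]

print("Great (top 5%) starts at:", cutoff(league, 0.05))
print("Abysmal (bottom 5%) ends at:", cutoff(league, 0.05, top=False))
```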

If what one cares about is ranking stats in a vacuum, on the other hand (i.e. in general, what's a great batting average), then these methods will not suffice. One will need to use resampling methods in order to estimate those ranks.

 
 

cannonball 1729

Member
SoSH Member
Sep 8, 2005
3,578
The Sticks
Jnai said:
 
For qualified OBP this year, it's:
Great 0.406
Good 0.372
Average 0.338
Poor 0.304
Abysmal 0.270
 
 
My favorite way of thinking of OBP is this:
 
1.) Take OBP
2.) Subtract 70 points
3.) Use your regular Batting Average scale.
 
So a .270 OBP is like having a .200 batting average.  And a .400 OBP is like a .330 batting average.  
 
70 points is usually close to the average difference between BA and OBP, so it's a good number to use.  It checks out empirically, too - if you take Jnai's OBP chart and subtract 70, you basically get his BA chart.
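 
To see how close the match is, here is a small sketch that applies the subtract-70 rule to the OBP chart and prints it next to the BA chart. Both charts are the 2014 figures quoted in this thread; the function name is made up for the example.

```python
def obp_as_batting_average(obp, gap=0.070):
    """Shift an OBP down by the typical BA-to-OBP gap so it reads on a BA scale."""
    return round(obp - gap, 3)

obp_chart = {"Great": 0.406, "Good": 0.372, "Average": 0.338, "Poor": 0.304, "Abysmal": 0.270}
ba_chart = {"Great": 0.328, "Good": 0.300, "Average": 0.272, "Poor": 0.244, "Abysmal": 0.216}

for label, obp in obp_chart.items():
    shifted = obp_as_batting_average(obp)
    print(f"{label:8s} OBP {obp:.3f} -> {shifted:.3f}  (BA chart: {ba_chart[label]:.3f})")
```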
 

Reverend

for king and country
Lifetime Member
SoSH Member
Jan 20, 2007
64,533
Bleedred said:
Nevermind.   I'm sorry I asked.
Hey now, buck up li'l pardner--Jnai gave you a simple answer in there:

Go to FanGraphs and click on their glossary (top right). They have the breakdowns for each stat, just as you asked for.
 

Pilgrim

Member
SoSH Member
Mar 24, 2006
2,407
Jamaica Plain
Also on Fangraphs, if you go to a player page, click the graphs tab, and pick a tab, there will be big sidebars labeled GOOD and POOR.

If you're just eyeballing a stat you don't have a feel for, that should work.
 

Hank Scorpio

Member
SoSH Member
Apr 1, 2013
6,996
Salem, NH
As far as OBP, SLG and OPS go, is there a scale which values one against the other? 
 
For example, maybe .400 worth of OBP is worth .475 of SLG.
 
I suspect they're not equal (.350 OBP pretty good, .350 SLG pretty awful), and the weighted value of each varies from year to year, depending on the nature of the league's overall offense.
 
We have OPS+, but why not SLG+ and OBP+? 
 
Given a player with a .900 OPS, is there an ideal "split" between OBP and SLG? (For example, a .400/.500 player is valued differently than a .450/.450 player... something unrealistic like .700/.200 is probably ideal, but probably not on the 2014 Red Sox)
 

doctorogres

New Member
Aug 27, 2010
117
wOBA does this, factoring in the relative run values of AVG, OBP, and SLG to create a weighted stat that shows a hitter's total value on an OBP scale (so .400 is great, .330 is average, etc.). For the relative values you can look up the formula on Fangraphs, but OBP is something like 1.8x as valuable as SLG. I don't think this stat is adjusted for park and era, though.
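
For a feel of how the calculation works, here is a rough sketch. As far as I understand the Fangraphs version, the formula weights each way of reaching base (walk, hit by pitch, single, double, triple, home run) and divides by AB + BB - IBB + SF + HBP. The weights below are placeholders of roughly the right size, NOT the official constants for any season, and the stat line is invented.

```python
def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr, weights):
    """Weighted on-base average from counting stats and per-event weights."""
    unintentional_bb = bb - ibb
    numerator = (
        weights["BB"] * unintentional_bb
        + weights["HBP"] * hbp
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator

# Placeholder weights for illustration only; look up the real ones for the season.
example_weights = {"BB": 0.69, "HBP": 0.72, "1B": 0.89, "2B": 1.27, "3B": 1.62, "HR": 2.10}

# Hypothetical stat line, not a real player.
value = woba(ab=550, bb=60, ibb=5, hbp=5, sf=5,
             singles=100, doubles=30, triples=3, hr=20,
             weights=example_weights)
print(f"wOBA: {value:.3f}")
```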