edX Sabermetrics 101 Discussion

absintheofmalaise

too many flowers
Dope
SoSH Member
Mar 16, 2005
23,831
The gran facenda
Thought I'd go ahead and start this over here for the course, which started today. 
Here are the people who indicated they have signed up for the course in the thread on the main board. Please let me know if I missed anyone.
me
Supermanny
Ferm Sheller + son
JGray38
Just a bit outside
rundugrun
dbn
wibi
Stitch01
dylanmarsh
JMOH
gtg807y
terrisus
uncannymanny
norm from cheers
mascho
PaulinMyrBch + son
adam42381
Rice4HOF
inJacobyWetrust
metaprosthesis
GeorgeThomas
OttoC
cjdmadcow
Don Bradman
fuzzy_one + son
HurstSoGood
brs3
MyDaughterLovesTomGordon
tbrown_01923
Scott Cooper's Grand Slam
MuzzyField
mpx42
trotsplits
StupendousMan
Laser Show
RIFan
CallYaz
weeba
richgedman'sghost
pokey_reese
Comfortably Numb
TheGoldenGreek33
hikeeba
SoxLegacy
NoLastCall125
PAB1353
ivanamp
ganador2004
NWsoxophile
BusRaker
 

absintheofmalaise

I was planning on going over the coursework this weekend and I wanted to see if others were planning on doing the same. It's tough for me during the week and I'd guess it is probably the same for many of us.
 

StupendousMan

Member
SoSH Member
Jul 20, 2005
1,925
One of the first assignments involves playing with SQL queries into a database.  The version of the MLB stats database we're using can be downloaded (for free) from the course site, or from Sean Lahman's site directly:
 
http://www.seanlahman.com/baseball-archive/statistics/
 
Learning about this has already paid back the cost of registering to audit ... heh.
 

Ferm Sheller

Member
SoSH Member
Mar 5, 2007
20,940
absintheofmalaise said:
I was planning on going over the coursework this weekend and I wanted to see if others were planning on doing the same. It's tough for me during the week and I'd guess it is probably the same for many of us.
 
I'm probably doing the same because, as you said, it's too hard to find time during the week. One of the nice things about online courses is that you can participate at times that are good for you.
 
Also, we can see how it goes, but we may wish to break this out into sub-topics, especially as the weeks go by.  Just a thought.
 

absintheofmalaise

I agree about the sub-topics. We are at just under 40 people so far. That's not counting rev* and others who didn't post in either thread.
 
*rev is probably out trying to find a copy of Porky's or Old Yeller to rent right now.
 

Scott Cooper's Grand Slam

Member
SoSH Member
Jul 12, 2008
4,321
New England
absintheofmalaise said:
I was planning on going over the coursework this weekend and I wanted to see if others were planning on doing the same. It's tough for me during the week and I'd guess it is probably the same for many of us.
 
 
Ferm Sheller said:
 
I'm probably doing the same because, as you said, it's too hard to find time during the week. One of the nice things about online courses is that you can participate at times that are good for you.
 
Also, we can see how it goes, but we may wish to break this out into sub-topics, especially as the weeks go by.  Just a thought.
 
My laptop's been at the Genius Bar all week (broken microphone housing on a 2012 MacBook Air), but I just got the call to come pick it up. I'll attempt to get everything loaded and start working this weekend.
 

absintheofmalaise

Jeff Passan has an article out today on this course.  
 
 
Over the last month, since registration opened for the free online course Sabermetrics 101, a 13,464-person army of the curious, the dreamers, the scholarly and all other kinds signed up and turned a clever idea into a rousing success. The course launched Thursday morning. Nerdery went massively mainstream. The world did not end.
Actually, it got better, much better, and not simply because the brains behind the course, Boston-based professor Andy Andres, has put together a curriculum that explains the importance of baseball analytics, their history and even offers a tool through which students themselves can code projects. SABR101x, as it's called, is the latest signpost of the ongoing employment revolution in baseball, one that stretches from analytics to scouting and will continue to upend the makeup of front offices across the sport.
 

richgedman'sghost

Well-Known Member
Lifetime Member
SoSH Member
May 13, 2006
1,890
ct
absintheofmalaise said:
Thought I'd go ahead and start this over here for the course, which started today. 
Here are the people who indicated they have signed up for the course in the thread on the main board. Please let me know if I missed anyone.
[...]

I registered for the course and will attempt to audit it. Hopefully the math does not overwhelm me.
 

hikeeba

Member
SoSH Member
Dec 8, 2005
983
Signed up to audit. After going through this week's stuff, what else is there to do?
 

SoxLegacy

New Member
Oct 30, 2008
629
Maryland
I signed up when it was announced here and am looking forward to it. Like many of you mentioned,  I will be looking over the course materials this weekend.
 

absintheofmalaise

I went through the first module over the weekend. After playing some in the SQL Sandbox I'm really looking forward to exploring the Lahman db more. Need to get better on the SQL query syntax though.
 

MuzzyField

Well-Known Member
Gold Supporter
SoSH Member
I'm up to the history segment in module 1. So far, very interesting. I did waste some time trying to use MySQL directly on my Mac with the downloadable Lahman database. For now, I'll just stay in the sandbox and revisit this when I'm more comfortable with the formatting.
 

Laser Show

Member
SoSH Member
Nov 7, 2008
5,096
Also up to the history segment (though I skipped ahead and watched the Keri and Cameron interviews). Really cool stuff so far; I can see myself wasting a lot of time in the sandbox.
 

BusRaker

Member
SoSH Member
Aug 11, 2006
2,379
I'm so happy I can finally put my 15 years of SQL programming to good use!
 

hikeeba

Up to Week 3 Problem Set 5-2, and they start using w.yearID and then a w at the end of the FROM clause. Someone asked what was going on, and the response was that it was an 'alias for a sub-select' that made things clearer.
Did I miss that somewhere? It's still not clear.
 

wibi

Member
SoSH Member
Jul 15, 2005
11,848
hikeeba said:
Up to Week 3 Problem Set 5-2, and they start using w.yearID and then a w at the end of the FROM clause. Someone asked what was going on, and the response was that it was an 'alias for a sub-select' that made things clearer.
Did I miss that somewhere? It's still not clear.
 
It wasn't explained in the video, but essentially Andy is naming the table created by the nested SELECT "w".
 
If he had named it Fenway (i.e. ") Fenway" or ") AS Fenway"), then you would be calling SELECT Fenway.yearID at the start.
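
To make the alias idea concrete, here is a minimal sketch that can be run outside the course sandbox. It uses Python's built-in sqlite3 module rather than the course's MySQL setup, and the Teams rows, the winPct column, and the Fenway alias are all made up for illustration. The point is only that the name after the closing parenthesis becomes the table name the outer SELECT refers to. MySQL insists that every nested SELECT in a FROM clause gets such a name, which is why the w shows up at all.

```python
import sqlite3

# Hypothetical miniature stand-in for the Lahman Teams table (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Teams (teamID TEXT, yearID INTEGER, W INTEGER, G INTEGER)")
conn.executemany(
    "INSERT INTO Teams VALUES (?, ?, ?, ?)",
    [("AAA", 2000, 90, 162), ("BBB", 2000, 72, 162), ("AAA", 2001, 81, 162)],
)

# The nested SELECT produces a temporary table; "AS Fenway" names it so the
# outer query can refer to its columns as Fenway.<column>.
query = """
    SELECT Fenway.yearID, AVG(Fenway.winPct) AS avgWinPct
    FROM (
        SELECT yearID, 1.0 * W / G AS winPct
        FROM Teams
    ) AS Fenway
    GROUP BY Fenway.yearID
    ORDER BY Fenway.yearID
"""
rows = conn.execute(query).fetchall()
for year, avg_win_pct in rows:
    print(year, round(avg_win_pct, 3))
```

Swapping "AS Fenway" for a bare "w" (and Fenway.yearID for w.yearID) gives the same result; the alias is just a label.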
 

wibi

I'm running into trouble with an SQL query on the same problem set and could use some help.

I can create what he wants with AVG, MAX, MIN and STD, but I can't get it to sort by the STD.

My code and the rest of my comment are spoilered.

EDIT:  Got it solved ... Solution is in the spoiler text too

This returns the proper columns but sorted by year. I can't get it to sort by the STD as the problem requests, and if I move the STD calculations inside the nested SELECT then I end up with a single row returned that is just the yearID. Any suggestions as to what I'm doing wrong?
 
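SELECT w.yearID,
AVG(w.Error) AS AverageError,
MAX(w.Error) AS MAXError,
MIN(w.Error) AS MINError,
STD(w.Error) AS STDError
FROM (
      -- Pythagorean expectation: predicted wins = G * R^2 / (R^2 + RA^2).
      -- POW() does the squaring; in MySQL the ^ operator is bitwise XOR.
      SELECT teamID, yearID, W,
      G*POW(R,2)/(POW(R,2)+POW(RA,2)) AS predictedW,
      (G*POW(R,2)/(POW(R,2)+POW(RA,2))-W) AS Error
      FROM Teams
      WHERE yearID >= 1955
      ) w  -- w is the alias for the nested SELECT
GROUP BY w.yearID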
 

Add an
ORDER BY STDError DESC
at the end of the query
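
For anyone who wants to see the GROUP BY plus ORDER BY combination run end to end, here is a rough sketch using Python's built-in sqlite3 module instead of the course's MySQL sandbox. Several things in it are assumptions made for illustration: the Teams rows are invented, the squares are written as R * R, and because SQLite has no STD() function, a small PopulationStd helper class is registered under that name to stand in for MySQL's population standard deviation. The grouping happens per year, and the ORDER BY then sorts the grouped rows by the aggregate's alias.

```python
import math
import sqlite3


class PopulationStd:
    """Population standard deviation, standing in for MySQL's STD()."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row in the group; NULLs are skipped.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        mean = sum(self.values) / len(self.values)
        variance = sum((v - mean) ** 2 for v in self.values) / len(self.values)
        return math.sqrt(variance)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("STD", 1, PopulationStd)

# Made-up rows shaped like the Lahman Teams table (not real team data).
conn.execute(
    "CREATE TABLE Teams (teamID TEXT, yearID INTEGER, G INTEGER, "
    "W INTEGER, R INTEGER, RA INTEGER)"
)
conn.executemany(
    "INSERT INTO Teams VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("AAA", 1950, 154, 60, 700, 700),  # dropped by the WHERE filter
        ("AAA", 1960, 162, 81, 700, 700),
        ("BBB", 1960, 162, 71, 700, 700),
        ("AAA", 1961, 162, 81, 700, 700),
        ("BBB", 1961, 162, 79, 700, 700),
    ],
)

query = """
    SELECT w.yearID,
           AVG(w.Error) AS AverageError,
           STD(w.Error) AS STDError
    FROM (
        SELECT yearID,
               G * (R * R * 1.0) / (R * R + RA * RA) - W AS Error
        FROM Teams
        WHERE yearID >= 1955
    ) w
    GROUP BY w.yearID
    ORDER BY STDError DESC
"""
rows = conn.execute(query).fetchall()
for year, avg_error, std_error in rows:
    print(year, round(avg_error, 2), round(std_error, 2))
```

The STD has to stay in the outer SELECT: it is computed across all of a year's teams, so it only exists after the GROUP BY has collected them.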
 
 

hikeeba

Thanks wibi. I did see it in the following videos, just wish it had been covered beforehand. Times like that I miss a textbook.