Unnamed MLB team purchases Cray supercomputer

MakMan44

stole corsi's dream
SoSH Member
Aug 22, 2009
19,363
mpx42 said:
Guessing it's the Marlins, and the state of Florida is somehow paying for it.
No way. If this was true you think Loria wouldn't be bragging about it at every chance he could?
 

Section30

Member
SoSH Member
Aug 2, 2010
1,260
Portland OR
We need to use the Freedom of Information Act to force the NSA to tell us who it is.
 
Don't you have to get special permission from the government to buy a supercomputer? If yes, then someone will leak the info in short order.
 

67WasBest

Concierge
SoSH Member
Mar 17, 2004
2,442
Music City USA
My guess is the Rays.  They seem to have been experimenting with disruptive concepts and technologies for a while now and this would fit their model,  I can see Maddon buying off on this, if not outright lobbying for it.  Whoever it is, I'm sure the system is arriving somewhat ready, and there will be a refinement process over these next few months and when they launch it, we'll know.
 

soxhop411

news aggravator
SoSH Member
Dec 4, 2009
46,276
67WasBest said:
My guess is the Rays.  They seem to have been experimenting with disruptive concepts and technologies for a while now and this would fit their model,  I can see Maddon buying off on this, if not outright lobbying for it.  Whoever it is, I'm sure the system is arriving somewhat ready, and there will be a refinement process over these next few months and when they launch it, we'll know.
it says a team we least expect to use this type of data… So I am almost sure it's not the Rays
 

VORP Speed

Member
SoSH Member
Apr 23, 2010
6,633
Ground Zero
soxhop411 said:
it says a team we least expect to use this type of data… So I am almost sure it's not the Rays
That's coming from the CEO of Cray, though. If he's a very knowledgeable baseball fan, then that interpretation makes sense. If not, you could see someone thinking it unexpected that some little team from Tampa would be the first to buy a supercomputer and not the Yankees or Sox or Dodgers.
 

edoug

Member
SoSH Member
Jul 15, 2005
6,007
 
 
The team, which declines to be named, exemplifies an organisation that, five years ago, most people would not have dreamed would need, or even want, a supercomputer, he says.
Yankees?
 

Rovin Romine

Johnny Rico
Lifetime Member
SoSH Member
Jul 14, 2005
23,690
Miami (oh, Miami!)
MakMan44 said:
No way. If this was true you think Loria wouldn't be bragging about it at every chance he could?
He'd trade it for a bunch of laptops.  
 
**
 
That quote means nothing to me.  The mystery team could be so successful they wouldn't need or want it, so awful they wouldn't need or want it, or so like every other team they wouldn't need or want it.  
 

terrisus

formerly: imgran
SoSH Member
SemperFidelisSox said:
When does statistical analysis and technology cross the line into cheating? I feel like we're getting there.
 
Statistical analysis: Never
Technology: Well, a camera in centerfield sending a team signs is already designated as cheating. So, I'd say that's already been well marked off.
 
Also, while the article just gives a general "Cray Inc., which sells machines that range from $500,000 to $60 million," even at the upper end of that, $60 million isn't too crazy for an MLB team to be spending. The thing would be having things to run on it which would make it worthwhile.
 

SumnerH

Malt Liquor Picker
Dope
SoSH Member
Jul 18, 2005
31,893
Alexandria, VA
Section30 said:
We need to use the Freedom of Information Act to force the NSA to tell us who it is.
 
Don't you have to get special permission from the government to buy a supercomputer? If yes, then someone will leak the info in short order.
 
Nope.  There are export restrictions on supercomputers, under the Wassenaar agreement, EAR, and ITAR, but they're extremely loose these days and don't apply to domestic buyers.
 

JMDurron

Member
SoSH Member
Jul 15, 2005
5,127
The 5 year timeframe, and the emphasis on "need" makes me think Yankees.  2008-2009, the Yankees were still in "screw it, just throw cash around for everybody at the top of the FA rankings!" mode, and by the end of 2009 seemed just fine with that approach.

I'm probably reading too much into it, though.
 

derekson

Member
SoSH Member
Jun 26, 2010
6,224
It's gotta be the Astros. They've been hiring data analysis guys from all kinds of baseball websites and Luhnow is as progressive as anyone.
 
Plus the 5 year comment makes sense too: 5 years ago the Astros' front office was GM Ed Wade and a bunch of old guard baseball men.
 

Fred not Lynn

Dick Button Jr.
SoSH Member
Jul 13, 2005
5,253
Alberta
SemperFidelisSox said:
When does statistical analysis and technology cross the line into cheating? I feel like we're getting there.
As long as all teams have access to the technology, I think it is fair enough.

Of course with there being export restrictions on this equipment, Toronto may have something to say about it.
 

SumnerH

Malt Liquor Picker
Dope
SoSH Member
Jul 18, 2005
31,893
Alexandria, VA
Fred not Lynn said:
As long as all teams have access to the technology, I think it is fair enough.

Of course with there being export restrictions on this equipment, Toronto may have something to say about it.
The export stuff is basically nothing except with respect to Iran, North Korea, and a handful of others. Canada is a tier 1 country on the list and can pretty much buy whatever they want as long as they don't use it for nuclear research or reexport it to a controlled country. The Jays wouldn't have a problem buying big iron.
 

AlNipper49

Huge Member
Dope
SoSH Member
Apr 3, 2001
44,855
Mtigawi
Given the data involved, I think most of what they'd be doing with traditional methods would be ok on regular commodity gear. Cray to me suggests they want this data quickly, which leads me to believe that they'll be using it to analyze potential in-game adjustments versus using it for good old quantitative analysis.
 

Fred not Lynn

Dick Button Jr.
SoSH Member
Jul 13, 2005
5,253
Alberta
I know some teams have made advances in this area, but what are the chances that the team that bought the million dollar computer still trots out a couple buckets of KFC and a warm 2 liter bottle of Coke post game for the minor leaguers in low A they just gave enormous signing bonuses to?
 

Toe Nash

Member
SoSH Member
Jul 28, 2005
5,599
02130
Fred not Lynn said:
I know some teams have made advances in this area, but what are the chances that the team that bought the million dollar computer still trots out a couple buckets of KFC and a warm 2 liter bottle of Coke post game for the minor leaguers in low A they just gave enormous signing bonuses to?
Or pays its human statistical analysts barely minimum wage...
 

crystalline

Member
SoSH Member
Oct 12, 2009
5,771
JP
AlNipper49 said:
Given the data involved, I think most of what they'd be doing with traditional methods would be ok on regular commodity gear. Cray to me suggests they want this data quickly, which leads me to believe that they'll be using it to analyze potential in-game adjustments versus using it for good old quantitative analysis.
 
Could be, but supercomputers might be needed if they merely have a lot of long-running jobs.  Could just be that they have several big numerical problems and it's more efficient to buy a packaged cluster instead of building their own cluster.  98% of jobs on the big machines I've used are distributed, with each job runtime in 10's of minutes to hundreds of hours.  
 
Though as you say they could be trying to do Google-style distributed computations and get answers in milliseconds-to-seconds.   What kinds of problems would you actually want to do in baseball with that response time, though?  Perhaps motion analysis of pitcher mechanics to predict positioning?  I suppose they might have a motion-tracking camera on opposition pitchers with a model that predicts where the ball will go as pitchers tire, based on numerical analysis of multi-body-part kinematics.  It would have to be something that is dependent on variations in players' mechanics on that day and that pitch, however – if the model was based on variation in mechanics between pitchers and not on variation in the same pitcher across time, they could run the job to do that model estimation over days.
 

Rough Carrigan

reasons within Reason
Lifetime Member
SoSH Member
JMDurron said:
The 5 year timeframe, and the emphasis on "need" makes me think Yankees.  2008-2009, the Yankees were still in "screw it, just throw cash around for everybody at the top of the FA rankings!" mode, and by the end of 2009 seemed just fine with that approach.

I'm probably reading too much into it, though.
Yeah.  The Yankees buy equipment from the Krupps, not Cray.
 

Reverend

for king and country
Lifetime Member
SoSH Member
Jan 20, 2007
64,034
JMDurron said:
The 5 year timeframe, and the emphasis on "need" makes me think Yankees.  2008-2009, the Yankees were still in "screw it, just throw cash around for everybody at the top of the FA rankings!" mode, and by the end of 2009 seemed just fine with that approach.

I'm probably reading too much into it, though.
 
I don't think the Yankees have the personnel to come up with anything to use it for yet. Of course, buying capacity they can't leverage would be consistent with this year's plan.
 
I'm thinking either Houston or Pittsburgh as teams that have been rapidly building an analytics approach where none existed five years ago.
 

GRPhilipp

Member
SoSH Member
Jan 30, 2007
87
This would be useful for deciphering the system of signs used by the opponent's manager and coaches, no?  The ethics could be debated, but I assume this could work and I also assume there would be significant value in knowing when the opponent called a pitchout, bunt, steal, or hit & run.
 

AlNipper49

Huge Member
Dope
SoSH Member
Apr 3, 2001
44,855
Mtigawi
GRPhilipp said:
This would be useful for deciphering the system of signs used by the opponent's manager and coaches, no?  The ethics could be debated, but I assume this could work and I also assume there would be significant value in knowing when the opponent called a pitchout, bunt, steal, or hit & run.
Screw signs. It could track breathing from a SP and help it determine what pitch is coming next.

More simply I can see it telling fielders exactly where to stand based on pitch type, environmental, hitter stats, etc

I've often wondered why we've never seen this type of thing before (unless it's illegal, which would be news to me)
 

Sampo Gida

Member
SoSH Member
Aug 7, 2010
5,044
AlNipper49 said:
Screw signs. It could track breathing from a SP and help it determine what pitch is coming next.

More simply I can see it telling fielders exactly where to stand based on pitch type, environmental, hitter stats, etc

I've often wondered why we've never seen this type of thing before (unless it's illegal, which would be news to me)
 
In the past the data was relatively poor, and GIGO applies, so no need for much computing power.  With the new field f/x coupled with pitch f/x and whatever other f/x is out there, the data may be at a level that super computing power may be useful, especially for quick in game decisions.
 

Plympton91

bubble burster
SoSH Member
Oct 19, 2008
12,408
Does anyone else think that the time might come where they will have to limit shifts in order to keep the game interesting in the same way that golf limits the equipment people can use or basketball limits zone defenses?
 
I would favor a rule that says something like the third baseman cannot be to the right side of midpoint between 2nd and 3rd base, and the shortstop (second baseman) must remain to the left (right) of second base until after the pitch is thrown, and no infielder can be more than one step behind the infield dirt.  Some of this shifting borders on "taking the enjoyment out of the game" as far as I'm concerned.  And, if it is behind the drop in offense, it is probably only a matter of time before something is done to restore that balance.
 
Of course the counter argument is that you can't shift on hitters that use the whole field, and those are typically the most skilled hitters, so why not reward them in some way.  Still, I don't think it's right when David Ortiz hits a line drive 40 feet into the outfield and then the second baseman throws him out at first base.
 

absintheofmalaise

too many flowers
Dope
SoSH Member
Mar 16, 2005
23,335
The gran facenda
Plympton91 said:
Does anyone else think that the time might come where they will have to limit shifts in order to keep the game interesting in the same way that golf limits the equipment people can use or basketball limits zone defenses?
 
I would favor a rule that says something like the third baseman cannot be to the right side of midpoint between 2nd and 3rd base, and the shortstop (second baseman) must remain to the left (right) of second base until after the pitch is thrown, and no infielder can be more than one step behind the infield dirt.  Some of this shifting borders on "taking the enjoyment out of the game" as far as I'm concerned.  And, if it is behind the drop in offense, it is probably only a matter of time before something is done to restore that balance.
 
Of course the counter argument is that you can't shift on hitters that use the whole field, and those are typically the most skilled hitters, so why not reward them in some way.  Still, I don't think it's right when David Ortiz hits a line drive 40 feet into the outfield and then the second baseman throws him out at first base.
No. Good hitters can take advantage of the shifts and why would you want to penalize teams for being smart? 
 

Flunky

Well-Known Member
Lifetime Member
SoSH Member
Jan 3, 2009
1,918
CT
Sampo Gida said:
 
In the past the data was relatively poor, and GIGO applies, so no need for much computing power.  With the new field f/x coupled with pitch f/x and whatever other f/x is out there, the data may be at a level that super computing power may be useful, especially for quick in game decisions.
 
you mean like the data MLBAM is rolling out this season?
 
http://mlb.mlb.com/news/article/mlb/mlbam-introduces-new-way-to-analyze-every-play?ymd=20140301&content_id=68514514&vkey=news_mlb
 
and as soxhop pointed out in the earlier thread about it, the possibility for it to be real time?
 
"The goal over time, and hopefully certainly by this season, is to make these plays available in real time and start the debates," Bowman said. "But we have to make sure baseball operations sees it and they agree that these are accurate renderings. But this year, fans will be able to see these data and these videos."
 

Plympton91

bubble burster
SoSH Member
Oct 19, 2008
12,408
absintheofmalaise said:
No. Good hitters can take advantage of the shifts and why would you want to penalize teams for being smart? 
One is personal preference. I get enough statistical analysis in my day job for 2 lifetimes and really don't like it much intruding on my sports viewing.

Second is because I'd rather watch David Ortiz hit line drives to right field than ground balls past an empty 3B.

Third, shifting also can make the pitcher better than he should be.

In short, I don't want some stat geek affecting the outcome of a game any more than I want a bad umpire call affecting the game. Let the players' talents shine through. Like I said, I understand that a good player can beat a shift with a well placed ground ball, but my response is a golf clap as I remain seated and take a sip of an adult beverage. The reaction to Ortiz's granny in the playoffs was a bit different.
 

finnVT

superspreadsheeter
SoSH Member
Jul 12, 2002
2,153
I'm trying to figure out what you'd need a cray for that you couldn't do for a fraction of the cost with an off-the-shelf cpu/gpu cluster.  i would think that nearly any sort of statistical analysis would be performed plenty fast on the latter, or could be performed ahead of time.  i almost think you'd have to be doing some serious physics modeling for it to be worth it (i.e., how real-time conditions like humidity and wind interact with the speed & spin of the pitch when hit given the bat speed and angle used by the current hitter, in order to, say, know what pitches are going to be most effective at inducing a ground ball).
 

AlNipper49

Huge Member
Dope
SoSH Member
Apr 3, 2001
44,855
Mtigawi
finnVT said:
I'm trying to figure out what you'd need a cray for that you couldn't do for a fraction of the cost with an off-the-shelf cpu/gpu cluster.  i would think that nearly any sort of statistical analysis would be performed plenty fast on the latter, or could be performed ahead of time.  i almost think you'd have to be doing some serious physics modeling for it to be worth it (i.e., how real-time conditions like humidity and wind interact with the speed & spin of the pitch when hit given the bat speed and angle used by the current hitter, in order to, say, know what pitches are going to be most effective at inducing a ground ball).
A Cray isn't what it was 20 years ago (although it can be). They are most often used in concert with traditional servers as a 'fat node' within a cluster. The logic behind it is how it brokers communication between the CPUs and memory (probably oversimplifying it) but, yes, it's usually done for performance reasons, which is why I'm almost positive that it would be used to analyze and influence in-game decisions.

It seems expensive, but if it wins a team half a game then it's more than paid for the 3 million half a WAR goes for on the open market ;)
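
To put rough numbers on that break-even argument, here is a minimal sketch. It assumes the $3 million per half a WAR figure from the post above (so about $6 million per win) and the $500,000 to $60 million Cray price range quoted from the article earlier in the thread. The function name is made up for illustration.

```python
# Rough break-even arithmetic; the function name is illustrative only.
DOLLARS_PER_WIN = 6_000_000  # assumption from the post: half a WAR ~ $3 million


def wins_to_break_even(machine_cost, dollars_per_win=DOLLARS_PER_WIN):
    """Wins the machine must add to cover its purchase price."""
    return machine_cost / dollars_per_win


# Low end, the post's half-win case, and the high end of the quoted price range.
for cost in (500_000, 3_000_000, 60_000_000):
    print(f"${cost:,} machine -> {wins_to_break_even(cost):.2f} wins to break even")
```

This ignores staff, power, and software costs, so the real bar would be higher than these numbers suggest.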
 

SumnerH

Malt Liquor Picker
Dope
SoSH Member
Jul 18, 2005
31,893
Alexandria, VA
finnVT said:
I'm trying to figure out what you'd need a cray for that you couldn't do for a fraction of the cost with an off-the-shelf cpu/gpu cluster. 
 
Modern Crays are exactly that.  The Cray Titan, which was the fastest supercomputer in the world for part of 2013, was about 19,000 each of AMD Opteron 6274 CPUs and NVIDIA Tesla K20X GPUs. There's no more custom CPU hardware like there was in the old XMP/YMP days, they're "just" a vendor of well-constructed massively parallel implementations of off-the-shelf hardware.
 

Adrian's Dome

Member
SoSH Member
Aug 6, 2010
4,424
SumnerH said:
 
Modern Crays are exactly that.  The Cray Titan, which was the fastest supercomputer in the world for part of 2013, was about 19,000 each of AMD Opteron 6274 CPUs and NVIDIA Tesla K20X GPUs. There's no more custom CPU hardware like there was in the old XMP/YMP days, they're "just" a vendor of well-constructed massively parallel implementations of off-the-shelf hardware.
 
I totally understood all of this.
 

terrisus

formerly: imgran
SoSH Member
Plympton91 said:
One is personal preference. I get enough statistical analysis in my day job for 2 lifetimes and really don't like it much intruding on my sports viewing.

Second is because I'd rather watch David Ortiz hit line drives to right field than ground balls past an empty 3B.

Third, shifting also can make the pitcher better than he should be.

In short, I don't want some stat geek affecting the outcome of a game any more than I want a bad umpire call affecting the game. Let the players' talents shine through. Like I said, I understand that a good player can beat a shift with a well placed ground ball, but my response is a golf clap as I remain seated and take a sip of an adult beverage. The reaction to Ortiz's granny in the playoffs was a bit different.
 
So when does a shift become too much of a shift? Are fielders locked to specific spots on the field until the pitch is delivered?
 
Outfielders regularly adjust their position based on not only the hitter, but the type of pitch being thrown. Infielders play at different spots depending if there's a runner on base, as well as again the pitch and hitter tendencies. So is taking one step to the side alright? Why not 2? If 2 is fine, what's wrong with 3 steps? Where does one draw the line?
 
And, "Shifting can make the pitcher better than he should be?" What does that even mean? I mean, a pitcher is obviously going to have worse luck with an infield full of Derek Jeters over an infield full of José Iglesias. That's part of how Baseball works (and why stats such as FIP exist). So it's alright for a pitcher to look "better than he should be" because he has good fielders behind him, but not because his fielders are able to play in ideal positions? 
 

Sampo Gida

Member
SoSH Member
Aug 7, 2010
5,044
Plympton91 said:
Does anyone else think that the time might come where they will have to limit shifts in order to keep the game interesting in the same way that golf limits the equipment people can use or basketball limits zone defenses?
 
I would favor a rule that says something like the third baseman cannot be to the right side of midpoint between 2nd and 3rd base, and the shortstop (second baseman) must remain to the left (right) of second base until after the pitch is thrown, and no infielder can be more than one step behind the infield dirt.  Some of this shifting borders on "taking the enjoyment out of the game" as far as I'm concerned.  And, if it is behind the drop in offense, it is probably only a matter of time before something is done to restore that balance.
 
Of course the counter argument is that you can't shift on hitters that use the whole field, and those are typically the most skilled hitters, so why not reward them in some way.  Still, I don't think it's right when David Ortiz hits a line drive 40 feet into the outfield and then the second baseman throws him out at first base.
 
 
BABIP is still well above normal levels in the modern era (post 1930's), just a tick below the peak steroid years.  Also, shifting took place in Ted Williams and Yaz days, the main difference is it is being used for non-elite hitters as well.
 
As abs said, good hitters like Papi can hit against the shift, and last year Cano hit a bunt double against the shift.
 
I think the big factor in the reduction in offense is lower HR numbers and higher K rates, which may not be completely independent of one another.
 
The other thing about the shift is it relies on a pitcher hitting his spots.  It may make sense to shift on Hitter X with an inside fastball, but if that pitcher misses location and throws it outside, the shift may be less effective.

What the supercomputer may also do, with good enough data, is allow hitters to guess right more often on what pitch to expect in a given situation and count, and with a given defensive positioning. The computer can spit out expected pitch and location based on a pitcher's and team's tendencies, and the coaches can relay the signs to the batters inclined to use this.  This could actually boost offense.  To a certain extent scouting reports allow this to be done already, but adjustments for situation, count and positioning may make it more effective.
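
A toy version of that "expected pitch by count" idea might look like the sketch below. The pitch log is invented for illustration and the function names are hypothetical; a real system would also condition on situation, location, and defensive positioning, which this does not attempt.

```python
from collections import Counter, defaultdict

# Hypothetical pitch log: (balls, strikes, pitch_type). Not real data.
PITCH_LOG = [
    (0, 0, "FB"), (0, 0, "FB"), (0, 0, "CB"),
    (0, 2, "SL"), (0, 2, "SL"), (0, 2, "FB"),
    (3, 1, "FB"), (3, 1, "FB"), (3, 1, "FB"), (3, 1, "CH"),
]


def tendencies_by_count(pitch_log):
    """Map each (balls, strikes) count to a Counter of pitch types thrown."""
    table = defaultdict(Counter)
    for balls, strikes, pitch in pitch_log:
        table[(balls, strikes)][pitch] += 1
    return table


def expected_pitch(table, balls, strikes):
    """Return (most common pitch, its share) for a count, or None if unseen."""
    counts = table.get((balls, strikes))
    if not counts:
        return None
    pitch, n = counts.most_common(1)[0]
    return pitch, n / sum(counts.values())


table = tendencies_by_count(PITCH_LOG)
print(expected_pitch(table, 3, 1))
```

A lookup like this is cheap enough to run on a laptop, which is roughly the point made elsewhere in the thread about commodity gear.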
 

Red Sox Physicist

Well-Known Member
Gold Supporter
SoSH Member
Jul 15, 2005
296
Natick, MA
finnVT said:
I'm trying to figure out what you'd need a cray for that you couldn't do for a fraction of the cost with an off-the-shelf cpu/gpu cluster.  i would think that nearly any sort of statistical analysis would be performed plenty fast on the latter, or could be performed ahead of time.  i almost think you'd have to be doing some serious physics modeling for it to be worth it (i.e., how real-time conditions like humidity and wind interact with the speed & spin of the pitch when hit given the bat speed and angle used by the current hitter, in order to, say, know what pitches are going to be most effective at inducing a ground ball).
 
 
SumnerH said:
 
Modern Crays are exactly that.  The Cray Titan, which was the fastest supercomputer in the world for part of 2013, was about 19,000 each of AMD Opteron 6274 CPUs and NVIDIA Tesla K20X GPUs. There's no more custom CPU hardware like there was in the old XMP/YMP days, they're "just" a vendor of well-constructed massively parallel implementations of off-the-shelf hardware.
There is one difference: the networking fabric. On modern supercomputers, the nodes are connected by Infiniband which allows much faster internode communication than standard ethernet does. This becomes important when running parallel jobs with significant internode communication such as modeling complex physical systems with long range interactions. These types of systems generally have high memory demands as well. GPUs are limited by the small amount of memory per core and the cost of transferring data across the bus. The types of systems I model (defects in metals using density functional theory) require about 2-4GiB of memory/CPU core to run most efficiently. The simulation software still needs to catch up to the new heterogeneous CPU/GPU architectures.
 
I am curious about what an MLB team would need a Cray for. Most statistical calculations can be distributed fairly easily without requiring lots of internode communication. They could probably get away with a few off-the-shelf nodes and an ethernet backbone for that type of thing.
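
As a rough illustration of why that kind of statistic distributes so easily, here is a minimal sketch. The at-bat data and function names are made up, and local threads stand in for cluster nodes (an assumption for the sake of a runnable example). Each chunk is processed with no knowledge of the others, and only two small numbers per chunk come back to be combined.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_totals(at_bats):
    """Per-chunk work: needs nothing from any other chunk. 1 = hit, 0 = out."""
    return sum(at_bats), len(at_bats)


def batting_average(chunks, max_workers=4):
    """Map chunks to workers independently, then combine with one small reduce."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(chunk_totals, chunks))
    hits = sum(h for h, _ in partials)
    total = sum(n for _, n in partials)
    return hits / total if total else 0.0


# Hypothetical at-bat outcomes split into three chunks.
chunks = [[1, 0, 0, 1], [0, 0, 1], [1, 0, 0]]
print(batting_average(chunks))
```

Contrast this with a physics simulation, where every step on one node depends on results from its neighbors; that is the case where a fast interconnect starts to matter.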
 

SumnerH

Malt Liquor Picker
Dope
SoSH Member
Jul 18, 2005
31,893
Alexandria, VA
Red Sox Physicist said:
 
 
There is one difference: the networking fabric. On modern supercomputers, the nodes are connected by Infiniband which allows much faster internode communication than standard ethernet does.
 
Cray actually uses Gemini interconnects on current- and previous- generation machines.  It's a pretty cool architecture.  https://www.nersc.gov/users/computational-systems/hopper/configuration/interconnect/ has a writeup of the Gemini interconnect for an older generation machine (Hopper, which was the 5th fastest supercomputer in the world on the Top500 list in 2010), but it's mostly still applicable.
 

AlNipper49

Huge Member
Dope
SoSH Member
Apr 3, 2001
44,855
Mtigawi
Red Sox Physicist said:
 
 
There is one difference: the networking fabric. On modern supercomputers, the nodes are connected by Infiniband which allows much faster internode communication than standard ethernet does. This becomes important when running parallel jobs with significant internode communication such as modeling complex physical systems with long range interactions. These types of systems generally have high memory demands as well. GPUs are limited by the small amount of memory per core and the cost of transferring data across the bus. The types of systems I model (defects in metals using density functional theory) require about 2-4GiB of memory/CPU core to run most efficiently. The simulation software still needs to catch up to the new heterogeneous CPU/GPU architectures.
 
I am curious about what an MLB team would need a Cray for. Most statistical calculations can be distributed fairly easily without requiring lots of internode communication. They could probably get away with a few off-the-shelf nodes and an ethernet backbone for that type of thing.
Again, it's probably not a standalone. Their most common application these days is as part of clusters, facilitating (or diverting) those same bottlenecks that you mention.

Crays aren't like 30 years ago. They're 'just another server' - outside of the aforementioned architectural differences - these days. You basically buy them so that you can scale your commodity hardware more easily in both operational and peak capacities. Think of it as buying a point guard in the NBA. You're not starting a team with just one player, the point guard, or conversely you're not starting it with five PGs.

Clearly someone wants data fast, that's the takeaway from this. It would be really cool if there was a behind-the-scenes arms race. I'm also sure that some teams may agree with you and are going with Cray-less designs -- this is largely a result of the MSA signed between the team and Cray. Cray's announcement that a team is using them has material value.