I just hope that Cray was built on Ivytown. Opteron can't keep up. Xeon + Xeon Phi owns the #1 supercomputer in the world, and that probably isn't going to change anytime soon.
AlNipper49 said: Again, it's probably not a standalone. Their most common application these days is as part of clusters, easing (or routing around) those same bottlenecks that you mention.
Crays aren't like they were 30 years ago. These days they're 'just another server' -- outside of the aforementioned architectural differences. You basically buy them so that you can scale your commodity hardware more easily, in both operational and peak capacity. Think of it as signing a point guard in the NBA: you don't build a team around just one player, the point guard, but conversely you don't field five PGs either.
Clearly someone wants data fast; that's the takeaway from this. It would be really cool if there were a behind-the-scenes arms race. I'm also sure that some teams may agree with you and are going with Cray-less designs -- this is largely a result of the MSA signed between the team and Cray. Cray's announcement that a team is using them has material value.
Plympton91 said: One is personal preference. I get enough statistical analysis in my day job for two lifetimes and really don't like it intruding on my sports viewing.
Second is because I'd rather watch David Ortiz hit line drives to right field than ground balls past an empty 3B.
Third, shifting can also make a pitcher look better than he should.
In short, I don't want some stat geek affecting the outcome of a game any more than I want a bad umpire call affecting it. Let the players' talents shine through. Like I said, I understand that a good player can beat a shift with a well-placed ground ball, but my response is a golf clap as I remain seated and take a sip of an adult beverage. The reaction to Ortiz's granny in the playoffs was a bit different.
finnVT said: I'm trying to figure out what you'd need a Cray for that you couldn't do for a fraction of the cost with an off-the-shelf CPU/GPU cluster. I would think that nearly any sort of statistical analysis would run plenty fast on the latter, or could be performed ahead of time. I almost think you'd have to be doing some serious physics modeling for it to be worth it (i.e., how real-time conditions like humidity and wind interact with the speed and spin of the pitch when hit, given the bat speed and angle used by the current hitter, in order to, say, know which pitches will be most effective at inducing a ground ball).
ean611 said: I just hope that Cray was built on Ivytown. Opteron can't keep up. Xeon + Xeon Phi owns the #1 supercomputer in the world, and that probably isn't going to change anytime soon.
“Urika is unique in that it’s a global shared memory machine that lets you look at data in an unpartitioned fashion. This is very critical if you’re looking at graphs, which by nature are unpredictable. Further, certain graphs are non-partitionable—and if you do partition one, it changes the result of a query,” White explained. “There is no MapReduce job or partitioning that will do anything but fracture the graph to a point where it’s no longer reconstructable—and even for those you can reconstruct, it would take a lot of compute power.” In other words, this is a win for memory-bound problems, as opposed to compute-bound ones.
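White's point about partitioning changing query results can be illustrated with a toy sketch (this is purely illustrative Python, nothing like Urika's actual implementation; the graph, node names, and `reachable` helper are all made up): a reachability query that is answered correctly when the whole graph sits in one memory space, but gives a different answer once the graph is cut into partitions and a cross-partition edge is severed.

```python
from collections import deque

# Toy graph: a -> b -> c -> d. We'll "partition" it into {a, b} and {c, d},
# which severs the b -> c edge that crosses the cut.
edges = {
    "a": ["b"], "b": ["c"],
    "c": ["d"], "d": [],
}

def reachable(adj, src, dst):
    """Breadth-first search: is there a path from src to dst in adj?"""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Whole graph in one shared memory space: the query succeeds.
print(reachable(edges, "a", "d"))  # True

# Naive two-way partition: the b -> c edge crossed the cut and is lost,
# so the same query, run per-partition, now returns a different answer.
part1 = {"a": ["b"], "b": []}
part2 = {"c": ["d"], "d": []}
print(reachable(part1, "a", "d") or reachable(part2, "a", "d"))  # False
```

A real system would track cut edges and exchange frontier nodes between partitions, which is exactly the extra compute and coordination White is alluding to; the shared-memory design sidesteps it entirely.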
Urika is also fitting for the big data of MLB given the disparate data sets required to piece together a best-case scenario for team leaders. There are lots of sources, and combining that data requires a data structure that allows for federated queries. This is exactly why big pharma and a few others find RDF machines useful (in that case using SPARQL queries). “You could go ahead and do the equivalent of a hundred-way join from a relational database—the question is, how big of a dataset can you do that against?” asked White. “Unless you have something like Urika, which has the ability to do it in memory and with massive multi-threading, you’re not able to look at enough data.” He said that what they’re doing inside Urika would be the equivalent of a 30-50-way join in a normal relational database. Pulling from the large shared memory pool using SPARQL queries offers a more seamless blending of conditions to hypothesize against. And herein lies the selling point for the operational-budget-constrained MLB.
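The "multi-way join over triples" idea is easier to see in miniature. Below is a hypothetical sketch in plain Python (not Urika, not a real SPARQL engine; the triples, player names, and predicate names are invented for illustration): data lives as RDF-style (subject, predicate, object) triples, and a query is a list of patterns whose shared variables are joined against each other — each additional pattern is another "way" in the join.

```python
# Invented toy data: each fact is one (subject, predicate, object) triple.
triples = [
    ("ortiz",   "bats", "L"),
    ("ortiz",   "team", "redsox"),
    ("ortiz",   "avg",  0.309),
    ("pedroia", "bats", "R"),
    ("pedroia", "team", "redsox"),
    ("pedroia", "avg",  0.301),
]

def match(pattern, bindings):
    """Yield bindings extended by one (s, p, o) pattern.
    Terms starting with '?' are variables; everything else must match."""
    out = []
    for triple in triples:
        b = dict(bindings)
        ok = True
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                if term in b and b[term] != value:
                    ok = False
                    break
                b[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            out.append(b)
    return out

def query(patterns):
    """Join all patterns: each one narrows the surviving bindings."""
    bindings = [{}]
    for pat in patterns:
        bindings = [b2 for b in bindings for b2 in match(pat, b)]
    return bindings

# "Left-handed Red Sox hitters and their averages" -- a three-way join,
# analogous to SELECT ?player ?avg WHERE { three patterns } in SPARQL.
rows = query([
    ("?player", "bats", "L"),
    ("?player", "team", "redsox"),
    ("?player", "avg",  "?avg"),
])
print(rows)  # [{'?player': 'ortiz', '?avg': 0.309}]
```

Each pattern here is the moral equivalent of another table joined in SQL; a real SPARQL query over dozens of disparate feeds would stack dozens of such patterns, which is where White's hundred-way-join comparison comes from.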