Spaghetti Baseball
By John Sickels
Saturday, June 13, 1976. I was eight years old. My parents decided that it was time for me to spend some time away from home for the first time, so it was off to summer camp at the Episcopal Center for Camps and Conferences north of Des Moines.
The first thing I remember that Saturday morning was walking out of the house into what felt like a blast furnace: it was hot, humid, windy.
Mid-June is the heart of tornado season in central Iowa, and on the afternoon of June 13th the atmosphere would explode.
On the way to camp, we drove through a hellacious thunderstorm with large hail and fierce winds. After a while, we broke out into a clear area of the storm. And off in the distance, we saw this:

Tornado at Jordan, Iowa, June 13, 1976 (Iowa State University photo)
It was an F5 tornado, and one of the most powerful ever recorded in modern times. Dr. Theodore Fujita, renowned tornado researcher and developer of the famous F-scale for tornado damage, once remarked that this particular tornado was the strongest he had ever studied. The tornado hit a small town called Jordan, annihilating it. Remarkably, no one was killed. The tornado stayed in rural areas, which was most fortunate. A shift of just a few miles would have brought this monster through the heart of Ames, Iowa.
Witnessing this thing had quite an impact on my impressionable young mind. I decided that I wanted to be a "weatherman," a meteorologist. I read everything I could find about severe weather, thunderstorms, tornadoes. It was one of my biggest passions as a child and teenager, along with baseball.
Unfortunately, once I got into high school, I discovered that I was not very good at advanced math. And I was especially bad at physics. I could understand the general theory behind everything, but when it came down to pencil and paper and formulas and a scientific calculator, my mind would blank. I eventually came to the realization that I wasn't cut out to be a real meteorologist, so it just became another hobby.
The internet is a boon to severe weather nuts like myself. When I got online in 1996, I discovered a wealth of information available, things I could only have dreamed of having access to previously: model outputs, mesoscale discussions, convective forecasts, raw severe weather data, etc. No longer did I have to rely on local TV to let me know when a tornado watch was up: I could read the watch itself right off the net, including the detailed reasons why the forecasters felt the watch was needed.
The second thing I do every morning during the spring and summer, after checking the baseball scores, is to log onto the Storm Prediction Center website and check out the risks for severe weather in Kansas. Over the last few years, I've learned more and more about the various computer models and forecasting tools.
Now, understand, I wouldn't even call myself an "amateur meteorologist." I understand this stuff a lot more than I did five or ten years ago, but when the discussions start getting too technical my eyes still bleed. I can look at the SPC Mesoscale Analysis page and figure out where the big danger spots are. . .I can tell you to watch for the spots with 4000 CAPE, but I still can't make sense of a sounding or a hodograph on my own.
OK, enough jargon dropping. So what does any of this have to do with baseball?
Weather is a natural system. It is somewhat predictable, if you have enough data. Baseball players (and human beings in general) are also natural systems, and with enough data they are also somewhat predictable.
Meteorologists use computer models when making their forecasts. Each model uses a different set of assumptions in taking a data set and projecting it out into the future. If you poke around the internet, you will find charts like this one:

This is a chart of "model output ensembles," commonly called a "spaghetti diagram." Each line on the chart represents the output of a different computer model, in this case projecting the flow of the jet stream. A meteorologist making a forecast will consult different models that use different assumptions, to get an idea of the possible outcomes of the current situation.
There is more to it than just that, however. No good meteorologist will rely only on the computer data: there is a place for intuition, instinct, "gut feeling" if you will. I personally believe that what we call "intuition" is often an expression of subconscious pattern recognition on the part of the human mind.
Baseball player prediction systems like PECOTA or ZIPS operate on the same basic assumption as weather models. They are much less complex, of course, since there are fewer variables to consider. The parallels can only be drawn so far between weather and baseball, but they are there.
Of course, even the best model and the best human forecaster screw up sometimes. "High Risk" severe weather days sometimes result in nothing but a bit of wind and lightning. . .all the known parameters come together, but something just isn't quite right. . .perhaps the wind shear was less than forecast, or a cirrus shield prevented the atmosphere from destabilizing. In baseball, this is the "can't miss" prospect who misses, sometimes due to injury, and sometimes for no obvious reason at all. But more often than not, modern computer models can predict severe weather outbreaks days in advance. And more often than not, we can get a very good idea about what a player will do in the future through intelligent modeling and a bit of intuition.
This is why we are doing the Community Projection project. If you sit down and think about it, anyone who is 1) reasonably intelligent and 2) a baseball fan with a decent knowledge base has what amounts to a complex baseball algorithm in their head. The theory here is that if we take 30 or 40 or 60 personal projections for a player and put them together, the result would be like a spaghetti diagram of the possible outcomes for that player.
The key is that you have to take it seriously: you can't just slop down some numbers and say that Austin Kearns will hit .340 with a .700 SLG. But, if this spaghetti baseball theory is accurate, serious, carefully considered projections from enough fans should be as good as any computer model at predicting player outcomes. . .perhaps better.
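To make the idea concrete, here is a minimal sketch of how a pile of fan projections for one player could be boiled down to a center and a range. Everything in it is an assumption for illustration: the function name `summarize_projections`, the choice of median and quartiles, and the batting-average numbers are all made up, and this is not a description of how the Community Projections are actually tallied. The last value in the list stands in for the fan who slops down a .340.

```python
from statistics import mean, median, quantiles


def summarize_projections(values):
    """Summarize fan projections for one stat of one player.

    Returns the mean, the median, and the middle-half range
    (first and third quartiles) of the submitted values.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    return {
        "mean": mean(values),
        "median": median(values),
        "low": q1,
        "high": q3,
    }


# Hypothetical batting-average projections from eight fans.
# The final entry is a deliberately unserious outlier.
fan_avg = [0.262, 0.270, 0.275, 0.268, 0.281, 0.259, 0.272, 0.340]

summary = summarize_projections(fan_avg)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample the single .340 drags the mean upward while the median barely notices it, which is one reason a median and a range might be a safer summary than a plain average when not everyone takes the exercise seriously.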
Comments
NOT ENOUGH DATA??!!! NO PROBLEM!!!
MAYBE YOU REALLY ARE JAYNES' "HONEST WEATHERMAN"
http://omega.albany.edu:8008/JaynesBook.html
SEE CHAPTER 13...
by TOLAXOR on Mar 17, 2005 12:52 PM EST 0 recs
Jordan tornado
Tornadoes fascinate me as well. I never had the desire to be a weatherman, but I've been through two myself: one in Huntsville, Alabama, in 1974, and the second in Fort Worth in 1984. I'll never forget seeing that one as long as I live, and it was an F1 at best. I can't imagine being face to face with an F5.
I completely sympathize with your problems with higher math and physics. I'd always liked history better anyway, but after running into pre-calculus and physics my junior year of high school, I took the same path you did. Thankfully, the BA and MA programs in history at UT-Arlington didn't require much math or science!
by RCCook on Mar 17, 2005 1:08 PM EST 0 recs
tornadoes
by John Sickels on Mar 17, 2005 1:13 PM EST 0 recs
sweet
by Cabbage on Mar 17, 2005 1:26 PM EST 0 recs
Projections
On another note, your ads are blocked by my [and I suspect others'] corporate network.
by irwin on Mar 17, 2005 1:29 PM EST 0 recs
Great Job John
by ohad on Mar 17, 2005 1:55 PM EST 0 recs
DEPENDS ON WHAT YOU DO WITH IT
EVERY MODEL IS SUBJECT TO "GARBAGE IN/GARBAGE OUT", BUT GIVEN A STANDARD TAXONOMY (WHICH HE'S PROVIDING IN THE SPECIFIC STATISTICS) COULDN'T WE THEN TAKE OUR PROJECTIONS AND PERFORM SOME LEVEL OF INFERENCE BASED PROBABILITY ANALYSIS???!!! THEN, REPEATED SEVERAL TIMES, WE CAN MAKE A DETERMINATION OF WHETHER OUR COMMON SENSE IS GREAT OR NOT!!!!!!
by TOLAXOR on Mar 17, 2005 2:22 PM EST 0 recs
But does it help?
If you put 50 weathermen in a room (no SDS jokes, please) and you take the average of their predictions, you may well be wrong more often than not unless you live in Southern California. You will certainly miss the outliers.
I mean, this is a fun process and all, but it's intuitively hard for me to believe that it's going to yield better prediction results.
by studes on Mar 17, 2005 2:24 PM EST 0 recs
point
by John Sickels on Mar 17, 2005 2:45 PM EST 0 recs
good article, be careful what you wish for...
By the way, I was born in Storm Lake, Iowa, which is northwest of Ames in the same general part of the state. One reason tornadoes are so devastating in that area is the relatively flat land there; they tend to stay on the ground a long time. If you are in Kansas you probably experience the same thing.
The name Storm Lake comes from trappers who found when they tried to camp at the lake they had a hard time keeping their gear from blowing away. There used to be a minor league or semipro team there, the Storm Lake White Caps.
When I was a kid in the 60's I went to a college game in Storm Lake with my uncle and by accident we wound up sitting next to a scout for the St. Louis Cardinals. I remember he showed us a diagram of the new ballpark they were building, Busch Stadium. I also remember the guy he was scouting, a shortstop, hit a home run and later struck out. When the guy struck out he threw his bat over the top of the dugout. When the coach came up into the stands to talk to the scout, the scout was as interested in the guy throwing the bat as he was in the home run. He wanted to know if the guy threw his equipment a lot.
Thanks for the article.
by alstl04 on Mar 17, 2005 2:29 PM EST 0 recs
Hey John
by ohad on Mar 17, 2005 2:45 PM EST 0 recs
article
by rdiersin on Mar 17, 2005 2:53 PM EST 0 recs
predictions
I think we've been given incredible minds that can do a lot of things we don't often realize, and taking loads of info and quickly formulating it into a general impression is one of them.
I think this is a great experiment, John.
by wijamie on Mar 17, 2005 3:41 PM EST 0 recs
like a futures market
It seems to me that this kind of "noise" is hard to eliminate in an open, free forum (especially if you're going to allow Mets fans to vote). Now if people had to put something at stake like...money, they would take it more seriously.
Poindexter wanted to set up a terrorism futures market a few years ago. The public/congress didn't receive it well, but, ultimately, he thought it would help provide better predictions than what they had. For a variety of reasons, I didn't think that was correct. Setting aside the revolting concept of the government paying someone for "winning" a bet predicting a terror attack (do you think that would come up in an election year), anyone who won such a bet would immediately be placed under suspicion and investigated/arrested. This would seem to keep people who are "in the know" out of the game.
I'm not sure this would apply to a baseball futures market. Unless they let Pete Rose back in as a manager; "Put $5,000 on Paul Wilson to pitch 250 innings!!!"
by chunkylover22 on Mar 17, 2005 4:02 PM EST 0 recs
yes
by John Sickels on Mar 17, 2005 4:11 PM EST 0 recs
but of course
Some fans are just fans of certain players, and hold them in higher regard than others, therefore possibly seeing more potential than others.
You can't just use stats to contemplate a prospect's future, and you definitely wouldn't just use gut instinct or scouting reports. So to get a perfect combination of both you would need subjective evaluation from an unbiased stathead.
by JFP on Mar 17, 2005 4:59 PM EST 0 recs
IT'S A PRETTY OLD PRACTICE
AS FAR AS THE NOISE, A FAIRLY "SIMPLE" (RELATIVE WORD, EH?) BAYESIAN ANALYSIS WOULD BE ABLE TO TEST THE MODEL, "NOISE" (METS FANS, OVER-AMBITIOUS INPUT GIVEN BY MR. AND MRS. KEARNS) SOURCES WOULD BE IDENTIFIED AND FALL OUT PRETTY EASILY!!!!!!
by TOLAXOR on Mar 17, 2005 5:18 PM EST 0 recs
Why are you yelling?
17th century Amsterdam? Don't get me wrong, I'm a huge fan of Amsterdam, but what projections were they selling?
by chunkylover22 on Mar 17, 2005 6:00 PM EST 0 recs
Amsterdam
by Flynn Blake on Mar 17, 2005 8:56 PM EST 0 recs
Bayes Theorem
As I understand it, it states that a higher frequency of an event in the past leads to a higher probability of a future occurrence.
However, I could very well be quite wrong. After all, there was a good reason I am an attorney and not a mathematician.
by irwin on Mar 17, 2005 10:13 PM EST 0 recs
Range
Also, just a comment on mathematical projections in general. You can break down a system into a predictable part (signal) and a random part (noise). The measure of a prediction system is more a measure of how the noise and signal compare to each other. If the noise dwarfs the signal, then there isn't much you can do, no matter how much you want to. Likewise, with relatively little noise you can do a great job. The question is just how noisy baseball numbers are in general, and I have yet to see any strong published studies on the matter (at least in free content). This line of thinking is what motivates my earlier comments...
by dschonbe on Mar 17, 2005 6:00 PM EST 0 recs
Weather
I am interested in weather. I live in SW Florida, so I get afternoon thunderstorms in the summer and some tropical weather and hurricanes and that crap in the summer and early fall. What really got me interested in weather was the 1995 hurricane season where I got to drive through a hurricane (Erin) on the way to Ohio and then coming back being welcomed by Tropical Storm Jerry that dropped 16 inches of rain in about a day. Floods are quite awesome. There really never are destructive, major floods in Florida since the land is flat and that's probably a good thing. I have never seen a tornado though. I'm in my junior year of high school and am taking pre-calculus and physics and am doing pretty well and find it interesting.
by ultxmxpx on Mar 17, 2005 6:03 PM EST 0 recs
GAUSSIAN CURVES, NOISE, AND BAYESIAN ANALYSIS
IT'S A VERY BASIC BAYESIAN MODEL, ISN'T IT???!!!
THE ONLY THING I WORRY ABOUT, IS THAT READERS AREN'T MAKING THEIR DECISIONS IN A VACUUM, AND HOW MUCH OF THE READER PROJECTION DATA IS INFLUENCED BY PROJECTIONS THEY'VE ALREADY SEEN VS. "INTUITIVE INFERENCE"!!!!!
by TOLAXOR on Mar 17, 2005 9:55 PM EST 0 recs
Noise...
Edmonds is a great example of that: he played with an injury in his chest which affected his swing in '03, and his numbers reflected that. If you didn't know he was injured and just looked at his numbers you would be surprised at the numbers he put up in '04 when he was healthy. I guess you could say knowledge of his injury would better enable you to reduce the "noise" in his numbers.
I'm purely a "stathead" when it comes to minor leaguers, that's why I find sites like this so helpful. John Sickels breaks down things like the "stuff" on the top pitching prospects. There are pitchers who have impressive minor league numbers but don't have a pitch they can throw by Major League hitters.
by alstl04 on Mar 17, 2005 10:57 PM EST 0 recs
combining projection systems
by joshua on Mar 18, 2005 11:29 AM EST 0 recs
yes
Like I said, this is all an experiment. I want to see what works and what doesn't.
by John Sickels on Mar 18, 2005 11:34 AM EST 0 recs







