AA and MLB hitting production by AA batters between 1995-2002
I felt like posting this for both BBB and Minor League Ball.
I put together a spreadsheet of all batters who hit in AA seasons between 1995 and 2002, along with their MLB stats (minimum 100 MLB PA). I only included a batter's MLB production if he met the following criteria: he was younger than 26 and had a minimum of 400 PA in his AA season(s). I built the spreadsheet using numbers from B-R and Fangraphs (glove tap to those two):
https://docs.google.com/spreadsheet/ccc?key=0AnAFMTj7pea8dFBtVnZBSUMzYzQyYXM0UWx3a3BkZmc#gid=0
More after the jump.
I chose 100 MLB PA as the minimum because I didn't know what minimum people would want, so I went with 100 to let people pick their own cutoff. I chose fWAR/100 games, since it was more convenient to use fWAR as opposed to rWAR. Just to note, a player's wOPS+ is 1.8*OBP + SLG, league adjusted (though not park adjusted). I thought wOPS+ was a good proxy for wRC+, since Fangraphs doesn't show MiLB wRC+ prior to the 2006 season. I also estimated a player's contact rate with the following formula: estContact% = (AB-K)/AB.
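For anyone who wants to recompute those two stats, here is a minimal Python sketch. The function names are my own, and I haven't spelled out how the league adjustment works, so the "ratio to league average, scaled to 100" step below is only an assumption about one common way to do it.

```python
def weighted_ops(obp, slg):
    """Raw weighted OPS from the post: 1.8*OBP + SLG."""
    return 1.8 * obp + slg


def wops_plus(obp, slg, lg_obp, lg_slg):
    """League-adjusted weighted OPS (no park adjustment).

    ASSUMPTION: the adjustment is the player's value divided by the
    league-average value, scaled so that league average equals 100.
    """
    return 100.0 * weighted_ops(obp, slg) / weighted_ops(lg_obp, lg_slg)


def est_contact_pct(ab, k):
    """Estimated contact rate from the post: (AB - K) / AB."""
    if ab <= 0:
        raise ValueError("AB must be positive")
    return (ab - k) / ab


# Hypothetical stat line, purely for illustration.
print(wops_plus(obp=0.360, slg=0.480, lg_obp=0.330, lg_slg=0.400))
print(est_contact_pct(ab=450, k=90))
```

Under that assumption, a league-average hitter comes out at exactly 100, which is what makes the number comparable across leagues and seasons.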
Just for fun, I put together some data (for AA batters with a minimum of 400 career MLB PA):
I separated batters into four age categories: 18-21, 22, 23, and 24-25 (only one 18-year-old AA batter qualified: Edgar Renteria in 1995). The number of AA batters in each group with a minimum of 400 career MLB PA was as follows:
- 18-21: 73 qualified batters out of 137 total AA batters (53.3%)
- 22: 63 out of 151 (41.7%)
- 23: 57 out of 217 (26.3%)
- 24-25: 70 out of 421 (16.6%)
The average fWAR/100 of the qualified batters in each age group was as follows:
- 18-21: 0.91 fWAR/100
- 22: 0.98
- 23: 0.63
- 24-25: 0.66
Out of curiosity, I also wanted to see which of the MiLB stats I was interested in (K%, BB%, wOPS+, BB/K) correlated most strongly, within each age group, with the following MLB stats: MLB K%, MLB BB%, wRC+, fWAR/100, and MLB BB/K. Some should be obvious, but I wanted to look at how strong the relationships were. I used the correlation coefficient (R) to measure the relationship between the stats. The results were as follows (I'll have to check for p-values later):
18-21:
MLB K% correlated most with MiLB K% (R = 0.801)
MLB BB% correlated most with MiLB BB% (R = 0.755)
wRC+ correlated most with wOPS+ (R = 0.430)
fWAR/100 correlated most with BB/K (R = 0.317)
MLB BB/K correlated most with BB/K (R = 0.701)
22:
MLB K% correlated most with MiLB K% (R = 0.703)
MLB BB% correlated most with MiLB BB% (R = 0.674)
wRC+ correlated most with wOPS+ (R = 0.455)
fWAR/100 correlated most with BB% (R = 0.318); wOPS+ was close behind (R = 0.315)
MLB BB/K correlated most with BB/K (R = 0.630)
23:
MLB K% correlated most with MiLB K% (R = 0.565)
MLB BB% correlated most with MiLB BB% (R = 0.568)
wRC+ correlated most with wOPS+ (R = 0.340)
fWAR/100 correlated most with BB% (R = -0.217)
MLB BB/K correlated most with BB/K (R = 0.450)
24-25:
MLB K% correlated most with MiLB K% (R = 0.731)
MLB BB% correlated most with MiLB BB% (R = 0.655)
wRC+ correlated most with wOPS+ (R = 0.487)
fWAR/100 correlated most with BB% (R = 0.352)
MLB BB/K correlated most with BB/K (R = 0.661)
Total:
MLB K% correlated most with MiLB K% (R = 0.716)
MLB BB% correlated most with MiLB BB% (R = 0.671)
wRC+ correlated most with wOPS+ (R = 0.400)
fWAR/100 correlated most with BB/K (R = 0.198)
MLB BB/K correlated most with BB/K (R = 0.626)
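The procedure behind the lists above (split by age group, correlate each MiLB stat against one MLB stat, keep the strongest) can be sketched in plain Python. This is only an illustration: the column names and the sample rows are made up, and ranking by the absolute value of R is my assumption, although it fits the negative R listed for the 23 group.

```python
import math
from collections import defaultdict


def age_bucket(age):
    """Map an AA-season age to the four groups used in the post."""
    if 18 <= age <= 21:
        return "18-21"
    if age == 22:
        return "22"
    if age == 23:
        return "23"
    if 24 <= age <= 25:
        return "24-25"
    raise ValueError("age outside the 18-25 range used in the post")


def pearson_r(xs, ys):
    """Pearson correlation coefficient (R) of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


def strongest_predictor(rows, predictors, outcome):
    """Return (predictor, R) with the largest |R| against the outcome."""
    ys = [row[outcome] for row in rows]
    scores = {p: pearson_r([row[p] for row in rows], ys) for p in predictors}
    best = max(scores, key=lambda p: abs(scores[p]))
    return best, scores[best]


def strongest_by_age_group(rows, predictors, outcome):
    """Split rows by age group first, then pick the strongest predictor."""
    groups = defaultdict(list)
    for row in rows:
        groups[age_bucket(row["age"])].append(row)
    return {
        group: strongest_predictor(members, predictors, outcome)
        for group, members in groups.items()
    }


# Hypothetical rows, NOT taken from the spreadsheet.
rows = [
    {"age": 22, "milb_k_pct": 0.10, "milb_bb_pct": 0.12, "mlb_k_pct": 0.12},
    {"age": 22, "milb_k_pct": 0.20, "milb_bb_pct": 0.08, "mlb_k_pct": 0.21},
    {"age": 22, "milb_k_pct": 0.30, "milb_bb_pct": 0.11, "mlb_k_pct": 0.29},
]
print(strongest_by_age_group(rows, ["milb_k_pct", "milb_bb_pct"], "mlb_k_pct"))
```

The "Total" figures would correspond to calling `strongest_predictor` on all rows at once instead of splitting them by age group first.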
A few obvious issues: I haven't determined whether the correlations are significant (p < 0.05), so take these values with a grain of salt. Tied to the significance issue, the sample sizes were somewhat small for my liking.
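As a rough way to approach the significance question, here is a sketch that turns an R and a sample size into an approximate two-sided p-value using the Fisher z-transformation with a normal approximation. This is a swapped-in textbook approximation, not something I ran for the post, and the sample sizes in the example are an assumption: they are the counts of qualified batters (400+ MLB PA) per age group listed above.

```python
import math


def approx_p_value(r, n):
    """Approximate two-sided p-value for H0: true correlation is zero.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal statistic. It is an approximation, so borderline
    cases deserve an exact t-test instead.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# R values from the post; n is ASSUMED to be the qualified-batter count.
examples = [
    ("18-21: fWAR/100 vs BB/K", 0.317, 73),
    ("23: fWAR/100 vs BB%", -0.217, 57),
]
for label, r, n in examples:
    print(label, round(approx_p_value(r, n), 3))
```

With those assumed sample sizes, the first example falls below 0.05 and the second does not, which is the kind of split worth checking before reading much into the weaker fWAR/100 correlations.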
Nonetheless, the main emphasis of this was to just put together a spreadsheet of AA batter stats in seasons between 1995-2002 and their MLB stats. This spreadsheet took a lot of time and effort on my part, and my wrists are killing me. =P
What do you think of the spreadsheet I put together?
Comments
just want to double check
but you are saying Major League wRC+ correlated best with minor league wOPS+ for every age bracket right?
by blue bulldog on Jan 31, 2026 9:32 PM EST
Indeed it did
Mind you, the sample size was somewhat small.
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
by Frag on Jan 31, 2026 9:42 PM EST
well
additionally, there were only four potentially determinant variables, right? That’s not a bad thing necessarily, but the other three are quite different, so it makes sense that wOPS+ turned out to be the most correlated. Other variables more similar to wRC+ might have taken the cake instead, had they been included.
That’s a great amount of work you’ve done here. Very cool. I’d enjoy seeing how these would come out in a multivariate model, with age as a covariate. Perhaps you’ve done this, but it’s not clear from the writeup how you were arriving at the correlation coefficients. Did you separate the data by age class first, then run each of the four determinant variables separately against a result variable to see which of the four had the strongest correlation? It kind of looks that way from your reporting, but I'm not certain.
Kudos to you also for posting your spreadsheet.
by siddfynch on Jan 31, 2026 9:58 PM EST
I wanted to leave most of the wRC+ (plus others) work for others, which was why I posted the spreadsheet. I just wanted to look at a few variables for fun.
"Perhaps you’ve done this, but it’s not clear from the writeup how you were arriving at the correlation coefficients. Did you separate the data by age class first, then run each of the four determinant variables separately against a result variable to see which of the four had the strongest correlation? It kind of looks that way from your reporting, but I'm not certain."
That’s what I did. I probably should have stated that more clearly in the post.
Thank you! =)
by Frag on Jan 31, 2026 10:07 PM EST
OK then
not sure what your fluency with stats is, but from this and other posts you’ve made recently, it sounds like you have some access to stats packages and are learning as you go? Try running wRC+ as a multivariate model, the original 4 determinant vars as the vars (together), and with age as a covariate. You’re already set up, and this would be a nice step beyond looking at the R^2 of the variables separately.
by siddfynch on Feb 1, 2026 12:28 AM EST

by Frag on 













