This is a limited preview of a detailed analysis of how these factors (and a few others) have changed since Low A for recent AA pitchers who typically start. It is being posted to gather feedback from the likely audience for a more complete finished product. Please read through and offer critiques or suggestions in the Comments. Thank you.
- To quantify how the strikeout rate and walk rate of AA pitchers who mostly start changed in going from Low A to High A to AA.
- To determine whether the leagues that pitchers pass through influence the level-to-level changes that result in those parameters.
For a pitcher to be eligible, they had to satisfy all 3 of these criteria:
- Threw > 50 total innings at AA during 2011-12, averaging >3 IP/G
- Threw > 50 total innings at High A during 2009-12, averaging >3 IP/G
- Threw > 50 total innings at Low A during 2009-12, averaging >3 IP/G
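As a rough illustration of the three criteria above, here is a minimal Python sketch. The `LevelLine` record and both function names are hypothetical, and the sketch assumes the season windows (2011-12 for AA, 2009-12 for the A levels) were already applied when each pitcher's totals were compiled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LevelLine:
    """One pitcher's combined totals at one level (hypothetical record layout)."""
    innings: float  # total innings as a true decimal, not 50.1-style box score notation
    games: int      # total games pitched at the level


def qualifies_at_level(line: Optional[LevelLine],
                       min_innings: float = 50.0,
                       min_ip_per_game: float = 3.0) -> bool:
    """True if the pitcher threw > 50 innings and averaged > 3 IP/G at the level."""
    if line is None or line.games == 0:
        return False
    return line.innings > min_innings and line.innings / line.games > min_ip_per_game


def is_eligible(low_a: Optional[LevelLine],
                high_a: Optional[LevelLine],
                double_a: Optional[LevelLine]) -> bool:
    """A pitcher enters the study only if all three levels qualify."""
    return all(qualifies_at_level(line) for line in (low_a, high_a, double_a))
```

Both thresholds are strict inequalities, so a pitcher with exactly 50 innings at a level would be left out.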
Data Tracked and Statistical Analyses
For each player, the following were determined for each of 3 levels (Low A, High A, AA):
- K% (=strikeouts divided by plate appearances)
- BB% (=walks divided by plate appearances)
For each player, the following were then computed using the above data:
- Net Increase in K% from Low A to High A (=High A K% minus Low A K%)
- Net Increase in K% from High A to AA (=AA K% minus High A K%)
- Net Increase in K% from Low A (to High A) to AA (=AA K% minus Low A K%)
- Net Increase in BB% from Low A to High A
- Net Increase in BB% from High A to AA
- Net Increase in BB% from Low A (to High A) to AA
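The definitions above reduce to a few lines of arithmetic. The Python sketch below uses invented strikeout and plate appearance counts purely for illustration, and the function names are hypothetical rather than anything used in the study.

```python
def rate_pct(events: int, plate_appearances: int) -> float:
    """K% or BB%: events divided by plate appearances, expressed as a percentage."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return 100.0 * events / plate_appearances


def net_increases(low_a: float, high_a: float, double_a: float) -> dict:
    """The three Net Increases for one rate stat, in percentage points."""
    return {
        "low_a_to_high_a": high_a - low_a,
        "high_a_to_aa": double_a - high_a,
        "low_a_to_aa": double_a - low_a,
    }


# Invented example: 100 K in 500 PA at Low A, 90 in 500 at High A, 85 in 500 at AA.
k_low = rate_pct(100, 500)   # 20.0
k_high = rate_pct(90, 500)   # 18.0
k_aa = rate_pct(85, 500)     # 17.0
print(net_increases(k_low, k_high, k_aa))
# {'low_a_to_high_a': -2.0, 'high_a_to_aa': -1.0, 'low_a_to_aa': -3.0}
```

The same `net_increases` call works for BB% once the walk rates are computed with `rate_pct`. The Low A to AA figure is simply the sum of the two one-level changes.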
Those 6 Net Increases were then compared between the following groups of pitchers using simple pairwise t-tests:
- Midwest Leaguers vs South Atlantic Leaguers (assigned by the Low A league in which the pitcher threw most or all of their innings)
- California Leaguers vs Carolina Leaguers vs Florida State Leaguers
- Eastern Leaguers vs Southern Leaguers vs Texas Leaguers
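For readers who want to see what one of these pairwise comparisons involves, here is a minimal sketch in Python. The post does not say which form of t-test was used, so Welch's unequal-variance version is an assumption here, and the league samples are invented numbers, not study data.

```python
import math
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Invented Net Increase in K% values (percentage points), for illustration only.
net_k_by_high_a_league = {
    "California": [0.5, -1.0, 1.5, -2.0, 0.0],
    "Carolina": [-3.0, -1.5, -4.0, -2.5, -3.5],
    "Florida State": [-4.5, -2.0, -5.0, -3.0, -4.0],
}

# Three leagues give three pairwise comparisons.
for league_a, league_b in combinations(net_k_by_high_a_league, 2):
    t, df = welch_t(net_k_by_high_a_league[league_a], net_k_by_high_a_league[league_b])
    print(f"{league_a} vs {league_b}: t = {t:.2f}, df = {df:.1f}")
```

Turning `t` and `df` into a p-value requires the t-distribution, which the standard library does not provide. With SciPy installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` would perform the same Welch test and report the p-value directly.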
The 134 starting pitchers ("SP") who met the study criteria averaged 111 total innings (5.1 innings/game) at Low A, 127 total innings (5.3 innings/game) at High A, and 133 total innings (5.3 innings/game) at AA. Here’s their data:
While this sort of data doesn’t yield an accurate prediction of a Low A pitcher’s AA rates in these metrics, it does provide useful limits for how much improvement or decline could result by the time the pitcher is a AA starter. Adding and subtracting twice the standard deviation from the mean gives an approximate 95% range for individual pitchers (assuming the changes are roughly normally distributed), so about 95% of starters should improve or worsen by an amount that falls between those 2 values. As an example, in going from Low A to AA via a reasonable stop in High A, a starter’s AA K% should exceed their Low A K% by no more than 6.0 percentage points (= -2.8 + 2 × 4.4) and trail it by no more than 11.6 percentage points (= -2.8 - 2 × 4.4). Per the table, on average, a AA starter’s K% should be almost 3 percentage points below their Low A K%, and that allows an analyst or fan to put a given AA starter’s rise or decline since Low A in some perspective.
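The two-standard-deviation arithmetic above can be checked with a short Python sketch. The mean (-2.8) and standard deviation (4.4) are the Low A to AA K% figures quoted in the text, while the function names and the 22.0% Low A starting point are hypothetical.

```python
def two_sd_range(mean_change: float, sd_change: float) -> tuple:
    """Approximate 95% range for an individual's change: mean +/- 2 standard deviations."""
    return mean_change - 2 * sd_change, mean_change + 2 * sd_change


def plausible_aa_rate(low_a_rate: float, mean_change: float, sd_change: float) -> tuple:
    """Shift the range by a pitcher's Low A rate to bracket their likely AA rate."""
    low, high = two_sd_range(mean_change, sd_change)
    return low_a_rate + low, low_a_rate + high


low, high = two_sd_range(-2.8, 4.4)
print(round(low, 1), round(high, 1))  # -11.6 6.0

# Hypothetical pitcher with a 22.0% K% at Low A.
aa_low, aa_high = plausible_aa_rate(22.0, -2.8, 4.4)
print(round(aa_low, 1), round(aa_high, 1))  # 10.4 28.0
```

The bracket is wide, which is consistent with the point that these figures set limits rather than predict a specific AA rate.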
Analysis of League Effects
Statistically significant associations were sometimes identified between the Net Increase in K% or BB% over a one-level jump and one or more leagues, but the corresponding benefit/detriment to the pitching stat was usually diluted by what had transpired in the one-level jump that preceded or followed it. There was just one significant association that persisted in the path from Low A through High A to AA and it was between the Net Increase in K% and the pitcher’s High A league. Here’s that data in a table:
What the table shows is that a AA starting pitcher who passed through the California League, on average, had a net drop in K% relative to their Low A K% that was 2 percentage points smaller than that of one who passed through the Carolina League and 3 percentage points smaller than that of one who passed through the Florida State League. The former California Leaguers were also twice as likely as the other High A league graduates to have a AA K% that exceeded their Low A K%. Comparing the Net Increase in K% of the California Leaguers to that of the pitchers from each of the other High A leagues showed that the differences were statistically significant (California vs Carolina League, p=0.012; California vs Florida State League, p<0.001), while no significant difference existed between the Carolina and Florida State League data (p=0.257). Overall, the average Low A K% for a future California Leaguer was zero to one percentage point (non-significantly) below that of the future Carolina and Florida State Leaguers, while the California League graduates’ average AA K% was two percentage points (significantly) above that of the graduates of the other 2 leagues.
Having recently made the discovery, I can conjure up 4 potential explanations.
- The finding is just a random coincidence, despite a very high level of statistical significance.
- The organizations that have had California League affiliates generally have better arms to work with than the teams that don’t.
- The tougher, offense-biased environments found in the California League culled the weaker starters and pitch-to-contact artists before they reached AA, while similar pitchers tended to graduate from the other High A leagues and start in AA.
- The capacity of California League starters to strike out batters was boosted during their tenure in the league by some combination of facing aggressive swingers and the perils associated with too much contact, and this learned skill carried over to AA.
Most or all of this is probably on the horizon for the offseason:
- further scrutiny of the attrition rate and performance of starters passing through the 3 High A leagues to get some answers as to why the California Leaguers are better at maintaining their K%
- elimination of study inclusion criterion #2 to admit starters who bypassed High A entirely or threw fewer than 50 innings there to examine the net changes in their K% and BB% from Low A to AA
- expansion of data tracked by level to include batted ball type (groundball, flyball, line drive, etc.)
- a transition to more robust statistical methods to sort out the individual league effects given the multifactorial nature of these sorts of phenomena, with an eye towards adding inning totals at level and age at level as another set of independent variables
- extending the study timeframe back one year to grow the population and increase statistical power, effectively laying the foundation for determining whether the identified league-related biases persist beyond AA
What variables or comparisons would you like to see worked into the analysis? Have you seen similar data published or presented elsewhere? To what would you attribute the elevated strikeout rates of the AA starters who passed through the California League?