Sabermetrics: Cabrera vs. Trout, Round 2

Last week, I entered the fray on the Mike Trout versus Miguel Cabrera AL MVP debate. It’s similar to the 2010 AL Cy Young discussion: Felix Hernandez led the AL in ERA and was a close second in strikeouts, but managed just a 13-12 record because Seattle couldn’t score. The new era of baseball stats won out. Voters ignored wins, which have little to do with pitching quality, and Hernandez won the award.

Likewise, Trout lags Cabrera in highly publicized but somewhat meaningless stats (RBI, Triple Crown). Some saber-men would have you believe that Trout laps Cabrera in the only stat that matters (a WAR over 10, compared to 7 for Cabrera), but that requires a level of trust that I don’t have. WAR (Wins Above Replacement) is complicated to the point of complete confusion. Cabrera contributed more in some categories (doubles, homers, total bases, batting average) but less in others (triples, baserunning, defense). Is WAR capturing these contributions accurately?

True Runs Revised (A WAR Replacement)

Rather than critique WAR (which would take days), I developed a new, simpler stat: True Runs. True Runs (named in honor of my True Wins football statistic) estimates a player’s contribution to his team based only on simple statistics. I got some good comments on the methodology, and what better time to revise it than now, while listening to MVP chants ring out at Comerica Park in Detroit?

Per DRDR’s comment, I included outs/reached on error in the revised methodology:

  1. Using data since 1990, regress total runs scored by each team each season on total singles, doubles, triples, homers, walks, hit by pitches, usual outs/reached on error, strikeouts, double plays, stolen bases, and caught stealing in that season
  2. Take the coefficients from this regression, multiply them by each individual’s stats, and add up the result
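
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the numbers in this post: the column names, the choice to fit without an intercept, and the randomly generated "team seasons" are all made up for the example, and the real fit would use actual team totals since 1990.

```python
import numpy as np

# Event columns from step 1 (the short names are illustrative labels).
EVENTS = ["1B", "2B", "3B", "HR", "BB", "HBP", "OUT_ROE", "SO", "GDP", "SB", "CS"]

def fit_run_values(team_events, team_runs):
    """Step 1: least-squares fit of team runs on team event counts.

    Assumption: no intercept, so a player with no events gets zero True Runs.
    """
    coefs, *_ = np.linalg.lstsq(team_events, team_runs, rcond=None)
    return coefs

def true_runs(player_events, coefs):
    """Step 2: multiply each coefficient by the player's count and add up."""
    return float(np.dot(player_events, coefs))

def r_squared(team_events, team_runs, coefs):
    """Share of the variation in team runs that the fitted values explain."""
    resid = team_runs - team_events @ coefs
    centered = team_runs - team_runs.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

# Fake data only: arbitrary "run values" used to generate synthetic seasons.
rng = np.random.default_rng(0)
fake_values = np.array([0.5, 0.8, 1.1, 1.4, 0.3, 0.3, -0.1, -0.1, -0.4, 0.2, -0.3])
team_events = rng.integers(20, 200, size=(600, len(EVENTS))).astype(float)
team_runs = team_events @ fake_values + rng.normal(0.0, 5.0, size=600)

coefs = fit_run_values(team_events, team_runs)
player_line = team_events[0] / 9.0  # a made-up individual stat line

print(dict(zip(EVENTS, np.round(coefs, 2))))
print(round(true_runs(player_line, coefs), 1))
print(round(r_squared(team_events, team_runs, coefs), 3))
```

Swapping the synthetic arrays for real team-season totals (one row per team per year, one column per event) is the only change needed to run the same fit on actual data.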

Intuitively, the regression finds the best way to add up all these stats to most closely approximate total runs scored across all teams in all years. The result: True Runs now captures the four basic things a hitter can do at the plate (walk, get a hit, make an out or reach on an error, strike out), as well as steals. The regression coefficients approximate how many runs each of these actions is worth, on average.*

Here’s the top 10 for 2012 across both leagues:

We’ve got a close one. Trout’s True Runs per plate appearance is slightly higher (0.19 to 0.18), but his True Runs total is slightly lower (123.3 to 127.8) because he started 2012 in the minors. True Runs per plate appearance is important, since the same number of runs spread out over many more games generates fewer wins. Still, these guys are pretty much in a dead heat according to True Runs.**

Who Wins?

My hope is that True Runs yields a simple and transparent starting point for the debate. By keeping it simple, I miss some important aspects of baserunning and I don’t adjust for park differences (as hhohw commented last week), defensive contributions, or opposition quality. Most likely, Trout wins out on these categories (especially defense and baserunning — Fangraphs breaks down every side of the debate, though I can’t find any explanation of their stats).

However, I also ignore things that matter to MVP voters — playoff qualification, rare achievements (Triple Crown), and clutch performance (RBI). Commenter Dan noted that players likely face harder pitchers/better effort in clutch situations, and Cabrera probably had more opportunities because of his place in the order. He did not back down from the challenge — his RBI total reflects that. These statistics may not measure true skill (whatever that means), but I think they are going to push Cabrera over the top with the voters.

True Runs Errata

The True Runs multipliers from the regression make a lot of sense. My favorite is strikeouts. They do nothing to create runs and cost your team a plate appearance. It just so happens that teams have scored about 0.12 runs per plate appearance since 1990, which is very close to the “cost” of a strikeout in True Runs: -0.13.

Singles are worth 0.56 True Runs, more than walks and hit by pitches (which are nearly identical at 0.36/0.35), since runners can advance farther on a single. A steal is worth even less (0.13), since it adds no baserunner and advances only the runner who steals. Doubles, triples, and homers are worth progressively more, but a homer isn’t worth four times a single (any saber-man could have told you that).
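
To see how the multipliers quoted so far add up, here is a small Python sketch. It uses only the values stated in this section (reading 0.36/0.35 as walks and hit by pitches, in that order) and leaves out doubles, triples, homers, outs, double plays, and caught stealing, so the result is a partial total rather than a full True Runs figure. The stat line is hypothetical and does not belong to any real player.

```python
# Multipliers quoted in the text; every other event is omitted on purpose
# because its value is not stated here.
QUOTED_RUN_VALUES = {
    "single": 0.56,
    "walk": 0.36,
    "hit_by_pitch": 0.35,
    "steal": 0.13,
    "strikeout": -0.13,
}

def partial_true_runs(stat_line):
    """Sum of (multiplier x count) over the events with a quoted multiplier."""
    return sum(QUOTED_RUN_VALUES[event] * count for event, count in stat_line.items())

# Hypothetical season line, for illustration only.
hypothetical_line = {
    "single": 100,
    "walk": 60,
    "hit_by_pitch": 5,
    "steal": 20,
    "strikeout": 100,
}

print(round(partial_true_runs(hypothetical_line), 2))
```

In this made-up line, the 100 strikeouts cancel the value of 100 steals exactly, since the two multipliers are equal and opposite.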

The only multiplier that bothers me is the caught stealing number. It seems like this cost should be even higher than a strikeout, since it costs a plate appearance and a baserunner while adding nothing. Likewise, the value of a steal seems a little low. Perhaps teams steal when they are unlikely to score anyway (lowering the cost of a wasted plate appearance). I’m open to other suggestions to explain this, though.

If you’ve made it this far, I’ll reward you with True Runs for the remaining batting title-qualified hitters in 2012:

*For readers familiar with statistics, the R squared on this regression is 0.95, which means that the listed stats account for about 95% of the variation in runs scored. The remainder is likely due primarily to baserunning, as well as minor contributors such as balks, wild pitches, interference calls, etc.

**It’s interesting to note that Cabrera and Trout are quite close in oWAR as well (both in the 7s, with Trout slightly higher, probably because of his baserunning). I’ll try to compare/contrast oWAR and True Runs more in another post, but it seems like I’m getting something similar with FAR less work.


