Tag Archives: simple sabermetrics

Sabermetrics: Cabrera vs. Trout, Round 2

Last week, I entered the fray on the Mike Trout versus Miguel Cabrera AL MVP debate. It’s similar to the 2010 AL Cy Young discussion: Felix Hernandez led the AL in ERA and innings pitched but managed just a 13-12 record because Seattle couldn’t score. The new era of baseball stats won out. Voters ignored wins, which have little to do with pitching quality, and Hernandez won the award.

Likewise, Trout lags Cabrera in highly publicized but somewhat meaningless stats (RBI, the Triple Crown categories). Some saber-men would have you believe that Trout laps Cabrera in the only stat that matters (WAR over 10 compared to 7 for Cabrera), but that requires a level of trust that I don’t have. WAR (Wins Above Replacement) is complicated to the point of complete confusion. Cabrera contributed more in some categories (doubles, homers, total bases, batting average) but less in others (triples, baserunning, defense). Is WAR capturing these contributions accurately?

True Runs Revised (A WAR Replacement)

Rather than critique WAR (which would take days), I developed a new, simpler stat: True Runs. True Runs (named in honor of my True Wins football statistic) estimates a player’s contribution to his team based only on simple statistics. I got some good comments on the methodology, and what better time to revise it than now, while listening to MVP chants ring out at Comerica Park in Detroit.

Per DRDR’s comment, I included outs/reached on error in the revised methodology:

1. Using data since 1990, regress total runs scored by each team each season on total singles, doubles, triples, homers, walks, hit by pitches, usual outs/reached on error, strikeouts, double plays, stolen bases, and caught stealing in that season
2. Take the coefficients from this regression, multiply them by each individual’s stats, and add up the result

Intuitively, the regression finds the best way to add up all these stats to most closely approximate total runs scored across all teams in all years. The result: True Runs now captures the four basic things a hitter can do at the plate (walk, get a hit, make an out or reach on an error, strike out) as well as steals. The regression coefficients approximate how many runs each of these actions is worth, on average.*
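The two steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the team-season rows are synthetic, the weights in `ASSUMED_WEIGHTS` are made up (they are not the coefficients from my regression), only a subset of the stat columns is used, and I assume ordinary least squares with no intercept. The function names are hypothetical.

```python
import random

# Subset of the stat columns, for brevity (assumption: the real list is longer).
STATS = ["singles", "doubles", "triples", "homers", "walks", "outs"]

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_run_values(team_rows, team_runs):
    """Step 1: least squares (no intercept) via the normal equations X'X b = X'y."""
    k = len(team_rows[0])
    xtx = [[sum(row[i] * row[j] for row in team_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(team_rows, team_runs))
           for i in range(k)]
    return solve_linear_system(xtx, xty)

def true_runs(coefs, player_stats):
    """Step 2: multiply each coefficient by the player's stat and add up."""
    return sum(c * s for c, s in zip(coefs, player_stats))

# Synthetic team seasons built from made-up weights, so we can check
# that the regression recovers them. None of this is real data.
ASSUMED_WEIGHTS = [0.5, 0.8, 1.1, 1.4, 0.3, -0.1]
RANGES = [(900, 1100), (250, 330), (15, 45), (120, 240), (450, 650), (3900, 4200)]
rng = random.Random(0)
team_rows = [[rng.randint(lo, hi) for lo, hi in RANGES] for _ in range(60)]
team_runs = [sum(w * x for w, x in zip(ASSUMED_WEIGHTS, row)) for row in team_rows]

coefs = fit_run_values(team_rows, team_runs)
hypothetical_player = [120, 40, 1, 44, 66, 400]  # invented stat line
for name, c in zip(STATS, coefs):
    print(f"{name}: {c:.3f}")
print("True Runs:", round(true_runs(coefs, hypothetical_player), 1))
```

Because the synthetic runs contain no noise, the fitted coefficients should match the made-up weights almost exactly. With real team data they would only be the best average fit, which is all True Runs claims to be.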

Here’s the top 10 for 2012 across both leagues

Cabrera Might Get the Triple Crown, but Does He Deserve the MVP?

Edit: Please see my later post as well, which corrects an omission here.

Miguel Cabrera has a shot at the Triple Crown this year. No one has done it since Carl Yastrzemski in 1967. Is it really possible that he could win the Triple Crown and not win the MVP? Well, yes. Every advanced stats guy out there is trumpeting Mike Trout for MVP, with his “wins above replacement” (WAR) above 10 (next best in the majors is 6.8) and his 13 “total zone total fielding runs above average” (basically, the number of runs he has saved with his fielding, compared to an average fielder).

The discussion is eerily similar to the AL Cy Young conversation in 2010. Felix Hernandez won because he led the AL in innings pitched, ERA, and, most importantly, WAR, even though his win-loss record was a mediocre 13-12.

The 2010 Cy Young was a victory for sabermetricians. Pitchers can’t control how many runs their offense scores. All they can do is put up a low ERA and stick around for as many innings as possible. Strikeouts help too, since they reduce the risk of errors, and walks hurt, since fielders can’t do anything about a walk. There might be some cases where pitchers rise to the occasion in a close game to get a win, but for the most part, getting a “win” has little to do with pitcher skill after accounting for pitchers’ direct performance statistics.

2012 MVP: the Saber-Men After Party?

This time around, sabermetric thinking is stacked heavily against Cabrera (and the media is paying attention):

• RBIs are meaningless. After accounting for total bases and on base percentage in some way, RBIs have little to do with individual skill
• Cabrera LEADS THE AL IN DOUBLE PLAYS with 28, which is not captured by any traditional stat (granted, he has Austin Jackson’s high OBP in front of him, so he has lots of chances)
• Trout steals lots of bases and never gets caught (46 for 50 this year), which also isn’t captured by traditional metrics
• Cabrera is a poor fielder (10 runs worse than average at third base), Trout is a good fielder (mentioned above)

All these factors lead to Trout’s 10.4 to 6.7 WAR advantage over Cabrera. If voters take these numbers seriously, it seems that we’ll be looking at another win for the number crunchers.

But What is WAR Anyway?

Nearly four extra wins is a lot, and WAR is widely accepted as meaningful. But before I leap on the Trout-wagon: is WAR actually a good statistic? Here’s a snippet from Baseball Reference’s WAR explanation:

There is no one way to determine WAR. There are hundreds of steps to make this calculation, and dozens of places where reasonable people can disagree on the best way to implement a particular part of the framework.

Uh oh . . . hundreds of steps is never a good sign,