Predicting the Super Bowl winner

With the Super Bowl ten days away, I thought it would be fun to look at some statistics for the winners in the last 25 Super Bowls (excluding the strike year in 1987). For each winner and loser, I list their regular season True Wins, regular season actual wins, regular season wins the previous season, and the Super Bowl line for the winner (lines from Vegas Insider). In the previous season column — “Last Wins” — I also list each team’s playoff performance from the year before in parentheses: a number indicates the round that they exited, “n” indicates that they didn’t make the playoffs, and “!” indicates that they won the Super Bowl. Here are the data:

Green shading represents a correct prediction. For example, I shade the “Wins” column green when the winner had more wins than the loser and I shade the “Line” column green when the winner was favored by Vegas. Yellow shading represents no prediction (i.e., the relevant statistic was the same for both teams), and white shading represents an incorrect prediction.
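The shading rule above amounts to a three-way comparison per game and per statistic. Here is a minimal sketch of that bookkeeping, assuming each statistic is oriented so that a larger value means "predicted to win". The function names and the sample numbers are hypothetical and are not the table from this post.

```python
def grade(winner_stat, loser_stat):
    """Grade one statistic for one game, from the eventual winner's point of view."""
    if winner_stat > loser_stat:
        return "correct"    # green: the statistic favored the winner
    if winner_stat == loser_stat:
        return "none"       # yellow: the statistic made no prediction
    return "incorrect"      # white: the statistic favored the loser


def tally(games):
    """Count the grades over a list of (winner_stat, loser_stat) pairs."""
    counts = {"correct": 0, "none": 0, "incorrect": 0}
    for winner_stat, loser_stat in games:
        counts[grade(winner_stat, loser_stat)] += 1
    return counts


# Hypothetical numbers for illustration only, not the data from this post.
example = [(12, 10), (11, 11), (9, 13)]
print(tally(example))  # {'correct': 1, 'none': 1, 'incorrect': 1}
```

For the "Line" column, one would first convert the line into a value that is larger for the favored team before passing it in.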

Previously, I showed some stunning results for playoff and Super Bowl prediction with True Wins since 2002. The playoff team with the most True Wins has won four of the last nine Super Bowls, and the second-ranked team by True Wins has won two more (the Patriots can make it three this year). These results hold up over this longer period as well: True Wins correctly predicted 20 of the last 25 Super Bowls, while the Vegas “favorite” won only 19 of 25. This suggests that True Wins, which corrects for luck in close games, is a valuable measure of team quality.
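For readers who want to see how a measure like this differs from actual wins and from raw point differential, here is a rough sketch. The comments below put the True Wins cutoffs at -7 and +7. The half-win credit for close games and the treatment of a margin of exactly 7 as "close" are assumptions made for illustration, since this post does not spell out the formula. The margins in the example are hypothetical.

```python
def true_win_credit(margin, cutoff=7):
    """Credit for one game, where margin = team score minus opponent score.

    ASSUMPTION: games decided by `cutoff` points or fewer count as half a win,
    whichever side won. The post only says the measure jumps at -7 and +7.
    """
    if margin > cutoff:
        return 1.0   # comfortable win
    if margin < -cutoff:
        return 0.0   # comfortable loss
    return 0.5       # close game, treated as a coin flip


def true_wins(margins, cutoff=7):
    """Sum the per-game credits over a season of margins."""
    return sum(true_win_credit(m, cutoff) for m in margins)


def actual_wins(margins):
    """Ordinary wins: one jump, at a margin of zero."""
    return sum(1 for m in margins if m > 0)


def point_differential(margins):
    """Total points scored minus total points allowed."""
    return sum(margins)


# Hypothetical six-game stretch, for illustration only.
margins = [3, -3, 21, -10, 7, 1]
print(true_wins(margins))           # 3.0
print(actual_wins(margins))         # 4
print(point_differential(margins))  # 19
```

In this made-up stretch the team went 4-2, but three of the four wins were close, so the step-function measure credits it with only 3.0 wins.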

Picking the winner based on actual win-loss record does much worse. Only 12 of the last 25 Super Bowl winners had more wins than the loser, and five had an equal number of wins, which leaves eight games where the team with fewer wins came out on top. The utter futility of using actual wins to predict winners is clear. Not only is this method inaccurate, but whole-number win totals often give no prediction at all because ties are so frequent.

In fact, using the previous season’s performance is more accurate than using the current season’s performance. To determine the predicted winner for this column, I take the team that advanced farthest in the playoffs the previous year and use regular season wins from that year as a tiebreaker. These two steps generate a prediction in all but one game, which is already an improvement over current season wins. The predictions themselves are correct in 16 out of 25 games. My hunch is that prior playoff success provides more useful information than regular season success, since most playoff teams are pretty good (i.e., there is less variation in opponents’ strength).
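The two-step rule described above can be sketched as a comparison on a pair of values: playoff depth first, then regular season wins. This is only an illustration. The function names are hypothetical, and ranking “!” above every numbered round is my reading of the codes used in the “Last Wins” column.

```python
def playoff_depth(code):
    """Convert a "Last Wins" playoff code into a sortable depth.

    "n" = missed the playoffs, a digit = the round in which the team exited,
    "!" = won the Super Bowl (ranked above any numbered round).
    """
    if code == "n":
        return 0
    if code == "!":
        return float("inf")
    return int(code)


def predict_from_last_season(team_a, team_b):
    """Each team is (name, last_season_wins, playoff_code).

    Returns the predicted winner's name, or None if the two teams tie on
    both playoff depth and regular season wins.
    """
    key_a = (playoff_depth(team_a[2]), team_a[1])
    key_b = (playoff_depth(team_b[2]), team_b[1])
    if key_a == key_b:
        return None
    # Tuples compare left to right, so wins only matter when depth is equal.
    return team_a[0] if key_a > key_b else team_b[0]


# This year's matchup, using the prior-season figures quoted later in the post:
# Patriots 14-2, out in the second round; Giants 10-6, missed the playoffs.
print(predict_from_last_season(("Patriots", 14, "2"), ("Giants", 10, "n")))  # Patriots
```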

So, what do we have on our hands this year? The Patriots come in with 12 regular season True Wins to the Giants’ 8 and 13 actual wins to the Giants’ 9. Last season, the Pats were 14-2 (eliminated in the second round) and the Giants were 10-6 (missed the playoffs on a tiebreaker to the Eagles). The Patriots are favored by three points at most sportsbooks, though the line has shifted closer to 2.5 over the last few days. All four statistics favor the Patriots, and all four have been wrong together only twice in the last 25 years. The Patriots were involved in both of those games, of course, including the Giants’ big upset following the 2007 season.

My theory about this year’s Super Bowl is similar to Bill Simmons’s theory about overrating and underrating. Simmons argues that players and teams go through cycles of being overrated and underrated, and the Giants (when it comes to playing the Patriots) are on the overrated side of things right now. They beat the Patriots in the Super Bowl, they beat them again this year in the regular season, and they are a really tough matchup because their front four gets pressure and the Patriots rely on the passing game. However, excluding that one matchup advantage, the Patriots are a far better team from top to bottom. It’s time for things to return to equilibrium.

Lastly, a couple of fun facts. Two Super Bowl losers got revenge and won the Super Bowl the next year in the 70s, but no one has done it since then. Also, there have been 18 different Super Bowl losers in the last 25, but only 14 distinct winners. Only the Patriots, Broncos, Steelers, Giants, and Packers have gone on to win a Super Bowl after losing one of the last 25, and each of them took at least five years after the loss to get the win. It looks like losing the Super Bowl is not the first step in building a dynasty. And is anyone else as excited as I am for Super Bowl L in four years?

4 responses to “Predicting the Super Bowl winner”

  1. Awesome! I love when football fans geek out with statistics!

    Just for fun, I’m running a math/psychology Super Bowl Squares contest that you might enjoy. Check it out:
    https://docs.google.com/spreadsheet/viewform?hl=en_US&formkey=dDd6U0U5OTBTcWZ1R1FfZ29JVjF4Tnc6MQ#gid=0

  2. Tyler, how does simple scoring differential do against your True Wins measure in predicting game outcomes? To me it seems like True Wins has the same problem as actual wins — it jumps discretely at certain relative scores (0 for actual wins, -7 and +7 for True Wins). True Wins has 2 jumps instead of one, so it captures a little more information, but it’s not clear to me why you should stop at 2, or why the jumps should be at +/-7 instead of +/-3, etc. Scoring differential is just as easy to calculate and uses all of the information in the relative score, albeit in a simplistic way.

  3. Indeed Chris, True Wins are a bit ad hoc (with -7/+7 as the cutoffs). The correct thing is probably to run regressions with Super Bowl win/loss (or margin of victory) as the outcome variable and a polynomial in total points for and points against as the explanatory variable. The estimated coefficients would allow for prediction of future Super Bowls, though of course checking past accuracy is a little dumb with this method, since the regression generates the best least squares fit by definition, so it would surely do quite well.

    I checked the simple method you suggest, comparing total scoring differential from the regular season. It makes the same prediction as True Wins in all but three cases: it gets the Broncos’ win over the Packers right (True Wins were equal at 12) but misses the Giants over the Bills and the Steelers over the Seahawks, both of which True Wins gets right. So, scoring differential gets 19 out of 25 correct, just like the betting line.

    My intuition for using True Wins instead of point differential is that the exact margin of victory in a blowout probably isn’t that indicative of team quality, since effort (and personnel) on both sides often changes in a blowout. That said, there seems to be little difference between the two measures. I haven’t run this with Pythagorean expectations out of protest, but it seems like any measure that focuses more on point differential (including True Wins) is a substantial improvement over wins alone.

  4. Pingback: LOS LIIIIIINNNNNNKKKKSSSSSS!!!!!!!!!!!!!! | Causal Sports Fan
