July 19th, 2013

The Big Hurst - Jeremy Hefner

skoormit - Mike Leake

La Osa Rosa - Francisco Liriano

The Big Hurst: Over the last several years, I've tinkered with a mathematical model to predict the winner of the World Series.  It's founded on a method borrowed from the 1984 Bill James Baseball Abstract.  I originally finished it in 2011, and I think the formula would've been over 90% accurate from 1985-2011.  It gets every World Series right from 2011 back to 2007.  It was wrong in 2006.  Then it gets 2005 and 2004 right.  And it was wrong again in 2003.  Then it correctly predicts every World Series from 2002 all the way back to 1985.  So in 26 opportunities (there was no World Series in 1994), it got 24 right.  The 1984 Bill James formula was something like 70% accurate, and he considered it successful enough to talk about it on the Today show (where his prediction that year was apparently wrong).  Even if you just use 2002-2011, I think it was 80% accurate.

The 2011 formula was:
+15 to the team with home field advantage
+7 to the team that had more runs scored AGAINST it
+7 to the team with more pitcher strikeouts
+7 to the team with FEWER wins in the regular season
+6 to the team with a better hitter OPS
+5 to the team with the bigger payroll
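
For anyone who wants to play along at home, here's a rough sketch of how the tally could be coded up.  The point values are the ones listed above.  Everything else is my assumption for illustration: the factor names, the function names (score_series and predict), the idea that the team with more total points is the pick, and the idea that a dead-even factor awards nothing.  The sample input uses the 2012 matchup as described later in this post, where the Giants held only home field.

```python
# Point values copied from the 2011 formula above.
# The dictionary keys are made-up labels for illustration.
WEIGHTS_2011 = {
    "home_field": 15,          # team with home field advantage
    "more_runs_against": 7,    # team that had MORE runs scored against it
    "more_pitcher_ks": 7,      # team with more pitcher strikeouts
    "fewer_wins": 7,           # team with FEWER regular-season wins
    "better_hitter_ops": 6,    # team with the better hitter OPS
    "bigger_payroll": 5,       # team with the bigger payroll
}


def score_series(edges, weights=WEIGHTS_2011):
    """Total the points for each team.

    `edges` maps a factor name to the team holding that factor.
    A factor left out of `edges` (say, a dead-even tie) awards no
    points. That tie rule is an assumption, not part of the formula.
    """
    totals = {}
    for factor, team in edges.items():
        totals[team] = totals.get(team, 0) + weights[factor]
    return totals


def predict(totals):
    """Return the team with the most points, or None on a tie (assumed rule)."""
    best = max(totals.values())
    leaders = [team for team, points in totals.items() if points == best]
    return leaders[0] if len(leaders) == 1 else None


# 2012 as this post describes it: the Giants held home field and
# nothing else, so the Tigers get the other five factors.
edges_2012 = {
    "home_field": "Giants",
    "more_runs_against": "Tigers",
    "more_pitcher_ks": "Tigers",
    "fewer_wins": "Tigers",
    "better_hitter_ops": "Tigers",
    "bigger_payroll": "Tigers",
}

totals = score_series(edges_2012)
print(totals)           # {'Giants': 15, 'Tigers': 32}
print(predict(totals))  # Tigers
```

Keeping the weights in a dictionary means a different set of point values can be passed in through the weights argument without touching the scoring code.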

Never laugh at the All Star Game promotion of "This Time It Counts."  I think they might have known something I didn't.  Notably, the team with the home field advantage (the factor with the most weight) didn't win in 2008, 1999, and 1992, but the system predicted these years right anyway.  The only team to sweep the system was the 2004 Red Sox.  They had every factor in their favor that year.

But, Murphy's Law being what it is, the system was wrong for the 2012 World Series.  It predicted the Tigers, but they got swept in 4 games.  Viewed through the lens of the system, 2012 felt a lot like 1992.  Almost the only factor the 1992 Braves had going for them over the 1992 Blue Jays was home field advantage.  In 2012, the only factor the Giants had was home field.  But the 1992 Braves lost and the 2012 Giants won.

After 2012, I refigured the formula:
+16 Home Field Advantage
+6 More Runs AGAINST
+6 More Pitcher Ks
+5 More Hitter OPS
+4 More Hitter Doubles
+4 FEWER Team Wins
+4 More Payroll

I think the success rate for this system is 85%. Because payroll numbers are problematic before 1985, I only figured it from then on, and it's 23 of 27 from 1985-2012. It misses in 2012, 2006, 2003, and (now) 1999, but still gets every year right from 1985-1998. It's also still 70% accurate over the last 10 years.

I don't know whether this formula is descriptive or predictive.  Causation is a bear.

Looking at this formula, the real question is why bother to check anything at all past home field advantage?  Over those 27 World Series, I think home field advantage predicts the winner in all but 5 cases (2008, 2006, 2003, 1999, and 1992).  That's 81% accuracy, without even considering anything else.  Am I missing the obvious forest for all the monkeying around with the trees?  All this other junk improves the accuracy by maybe something like 4 percentage points.  And in baseball, 4 points' worth of fuzziness may essentially be nothing.  Would the smart money in 2012 have been on the Giants, based purely on home field advantage?
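
To put the hit rates side by side, here's a quick back-of-the-envelope check.  The counts are the ones quoted in this post (24 of 26, 23 of 27, and 27 minus the 5 home field misses).  The hit_rate helper and the labels are just names I made up for the sketch.

```python
def hit_rate(hits, series):
    """Fraction of World Series called correctly."""
    return hits / series


# (correct picks, World Series played), as tallied in this post.
records = {
    "2011 formula, 1985-2011": (24, 26),
    "refigured formula, 1985-2012": (23, 27),
    "home field alone, 1985-2012": (22, 27),  # 27 series minus 5 misses
}

for label, (hits, series) in records.items():
    print(f"{label}: {hits}/{series} = {hit_rate(hits, series):.1%}")
# 2011 formula, 1985-2011: 24/26 = 92.3%
# refigured formula, 1985-2012: 23/27 = 85.2%
# home field alone, 1985-2012: 22/27 = 81.5%

# The extra factors are worth exactly one more correct pick in 27.
gain = hit_rate(23, 27) - hit_rate(22, 27)
print(f"gain over home field alone: {gain:.1%}")  # 3.7%
```

Seen this way, the "something like 4%" is a single World Series out of 27, which is a very small sample to hang six extra factors on.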

I honestly don't know if this is worth a hoot, but I thought I'd throw it up on the web.  Though the Cardinals seem like they might be the strongest bet at the moment to win it all, I think my early money would be on the Red Sox or Tigers.


The Big Hurst: Not an auspicious start to the back half for us, and give me the Wooden Egg of Infinite Sadness Award today.  Because it makes a fun story, I can reveal that I had my selection down to Gaudin or Hefner.  I seriously agonized between those two guys.  In the end, I didn't pick Gaudin because he's not a full-time starter.  I clearly chose the wrong guy.  Rats.  This game is maddening.


