Do defensive stats tell us anything worth knowing? Does having a defensive rating for players help us predict which teams will be most successful in keeping hits and runs off the scoreboard? That is a question that has never been answered, and one that I will try to explore. Previously, people have looked at how well defensive metrics correlate with themselves from year to year, and how well they correlate with other metrics. Going a bit deeper, people have shown that the metrics do not correlate as well for players who switch teams as they do for players who stay put. But what we really want to know is how useful they are in predicting which teams keep hits and runs off the scoreboard.
The standard has to be team defensive efficiency record (DER), a stat invented by Bill James that tells us simply how often a batted ball is turned into an out. This stat is not the sole responsibility of the fielders. Ballparks have a big effect on it, as do the types of batted balls allowed by pitchers: line drives usually become hits, while popups are almost always outs. These factors have to be accounted for before we can attribute DER to fielders.
What I have done is take simple defensive projections from four systems: UZR and PM (John Dewan's Plus/Minus), both found on Fangraphs; my own TotalZone; and Zone Rating runs, computed using Chris Dial's method with data from the excellent Replacement Level Yankees website. The simple projection is a Marcel-type, where I weight 2008 as 5, 2007 as 4, and 2006 as 3, and add in 1500 chances of league average for each player. Since 2008 has a weight of 5, a player who got 300 chances in 2008 and nothing before that has 1500 weighted chances, the same as the 1500 league-average chances added in, so his rating is regressed 50% to the mean. These simple projections are prorated to the actual number of defensive innings for each team in the 2009 season. Finally, the projected team defensive ratings are compared to a park- and BIP-adjusted figure for team runs, based on each team's DER.
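To make the weighting concrete, here is a minimal sketch of the Marcel-type projection described above. The function name, the input layout (runs above average and chances per season, most recent first), and the choice to express the result as runs per chance are my assumptions for illustration. The prorating to 2009 innings is left out.

```python
def marcel_projection(seasons, weights=(5, 4, 3), regression_chances=1500):
    """Project a fielder's rate (runs above average per chance).

    seasons: list of (runs_above_average, chances) tuples, most recent
             season first (2008, 2007, 2006). This layout is a
             hypothetical choice for the sketch.
    Returns (projected_rate, reliability), where reliability is the share
    of the projection that comes from the player's own data.
    """
    weighted_runs = sum(w * runs for w, (runs, _) in zip(weights, seasons))
    weighted_chances = sum(w * ch for w, (_, ch) in zip(weights, seasons))

    # The league-average chances contribute zero runs above average,
    # so they only enlarge the denominator.
    total_chances = weighted_chances + regression_chances

    projected_rate = weighted_runs / total_chances
    reliability = weighted_chances / total_chances
    return projected_rate, reliability


# The example from the text: 300 chances in 2008 and nothing before.
# The +10 runs figure is made up purely for illustration.
rate, reliability = marcel_projection([(10.0, 300), (0.0, 0), (0.0, 0)])
print(f"reliability: {reliability:.2f}, projected runs per chance: {rate:.4f}")
```

With 300 chances at weight 5, the player's own data and the league-average chances carry equal weight, so the projected rate is half his observed rate.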
The results are disappointing. The correlation coefficients average only .02. I won't tell you which system wins yet for a few reasons. TotalZone is neither first nor last among the 4.
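For anyone who wants to run the same kind of comparison, the sketch below computes a correlation coefficient between projected team defensive runs and the adjusted DER-based team runs. It assumes a plain Pearson correlation, and the function name is hypothetical. No real team data is included.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return sxy / sqrt(sxx * syy)
```

Here `xs` would hold one projected team rating per team for a given system and `ys` the matching adjusted team figure, giving one coefficient per system.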
The first reason not to publish the results is that there is more work to do. I think I need to design a more robust projection system for this, one that accounts for data at multiple positions. For example, Franklin Gutierrez has a rating of close to +20 for 2006-2008, but as a right fielder, since that is where he mostly played for Cleveland. He played center last year for Seattle and was the defensive star of 2009. Without data in center, though, all the projections currently expect him to be average. There is a simple solution: use the position adjustments to derive a CF projection from his RF data. That will take a little time to program and to apply to all 4 defensive systems.
While I work on the data and towards a more robust test, I invite anyone else who produces defensive ratings to join the test. You'll just need to provide the data for 2006 to 2008, along with an MLBAM ID for each player. The format needed is shown here, using Adam Everett's 2006 season as the example:
I'll add any system to the mix as long as you provide me the data in the proper format. In particular, I'd love to have Dave Pinto's PMR, Shane Jensen's SAFE, Peter Jensen's Big Zone, and Tango Tiger's Fan Scouting Report, if the creators of those systems are willing.