Tag Archives: hardball times

You’re Wrong; No, You Are!

As I’ve mentioned before, the comments section of the Hardball Times is a barren wasteland. But Matt Swartz’s latest treatise on being an idiot with a stats software package attracted some controversy, mostly because he’s an idiot with a stats software package. I’ve archived the SABR fight in case the comments disappear as things sometimes do on THT. And I’ve even highlighted the best parts.

Mike Fast:

The community has shown with certainty that there is little difference between pitchers? I would say that my study of HITf/x data indicated exactly the opposite.

And similarly for team defensive efficiency, a large portion of it is due to how hard the team’s pitchers allow the ball to be hit.

Single-year BABIP is a crude measure of pitcher skill, and it’s leading you to conclusions about the game of baseball that are very wrong.

Matt Swartz:

I’m not coming to any wrong conclusions. I don’t know what you think I’m doing with single season BABIP, but it’s not leading me to wrong conclusions.

There IS little difference relative to the difference between pitchers in strikeout rate, which is why it takes more than a season to stabilize.

What your study showed was that how hard balls are hit is persistent, and that it is correlated with BABIP. It didn’t widen the spread of pitcher BABIP skill levels in the MLB, which is and always has been minimal compared to the spread in strikeout rates.

I find your comment about “leading you to conclusions about the game of baseball that are very wrong” to be fantastically indicative that you haven’t really read and understood this or anything else I’ve written on the topic of pitcher BABIP. If you did, you could certainly understand your own findings better, and you’d know they aren’t contradictory.

The reason that single season BABIP is a crude indicator of pitcher skill is sample size. The variance in an individual’s BABIP skill level due to randomness is going to be about [.21/(number of batted balls)]. Knowing this, we can actually pin down that about 75% of single season BABIP variance is due to luck for pitchers with >=150 IP. The rest of it comes down to knowing the other 25%. We know that regressing team BABIP by the same process would yield another 13% of the variance in BABIP, which means that there is 12% for pitching.

Using single season BABIP to understand that 12% will do a pretty poor job. However, using peripherals and running a regression as I have will eliminate a lot of that noise. In fact, you can explain about 10.4% of that 12% by knowing peripherals. What your study likely did is duplicated some of the effort in understanding the first 10.4% (hard hit balls are correlated with peripherals; check your data, I’m sure it’s true) and supplemented a good portion of the remaining 1.6%.

In other words, nothing you found negates anything I’ve found at all. You’ve come up with a way to use proprietary data effectively. Unless you have that available, using peripherals does a pretty good job. I can’t even imagine what it is that you disagree with here, or what you think I don’t understand.

MF:

I’m not disputing your statistics. I’m disputing your conclusions about the game of baseball.

“What your study showed was that how hard balls are hit is persistent, and that it is correlated with BABIP. It didn’t widen the spread of pitcher BABIP skill levels in the MLB, which is and always has been minimal compared to the spread in strikeout rates.”

Right. But I did show that BABIP is a poor way to measure pitcher skill. We sorta knew that already, but some people had taken the BABIP findings to mean that pitcher skill was also minimal. I established that that conclusion from the evidence was wrong.

You are correct that strikeout rate picks up some of the hard-hit ball skill that pitchers have. However, it does not pick up nearly all of it.

Moreover, batted ball categories are pretty good at picking up vertical launch angle effects, but they are lousy at picking up how hard the ball is hit.

So your regressions are still missing some pretty important data.

Yes, the ways we have found to measure that data so far are proprietary. That doesn’t mean that we shouldn’t learn about the reality of baseball from that data and let that affect how we frame questions, though. I would certainly wonder why BABIP doesn’t better reflect how hard the ball is hit.

I found that almost half of team BABIP was due to how hard the ball was hit. So when you say it’s 12 percent pitching skill, that’s what I’m disputing. You could say that you can only detect that 12 percent of the team BABIP is due to the pitchers, but it’s a leap of logic to say that you’re looking at pitching SKILL there. And HITf/x data indicates in fact that you are not.

Also, I don’t understand why you insist on looking at single-season pitcher/team BABIP to determine that number. It is simpler to calculate, but it’s deceptive. Being rooted to single-season numbers is one of the big failings of modern sabermetrics.

MS:

Which of my conclusions about the game of baseball do you dispute?

You found that how hard a ball is hit is highly correlated. This is a self-contained statistic that is only useful inasmuch as it can teach you about singles, doubles, triples, home runs, outs, and errors. It doesn’t do me any good to know the statistic otherwise, except for how it relates to outcomes that affect games. So BABIP is a logical skill to try to infer from how hard a ball is hit, and your numbers do a nice job of hitting on that.

I think when you say “half of BABIP was due to how hard the ball was hit,” you’re either using same year data or R instead of R^2 or doing both. I’m guessing you’re doing correlations, while I’m doing R^2.

But if it’s just same year data, you’re including luck in terms of how hard a ball was hit (of course pitchers will deviate around their true talent rate in this category as well). That doesn’t measure skill. That measures outcomes.

My regressions are not intended to be the end-all summary of a pitcher’s true BABIP skill. They pick up about 80% of the possible variance that could exist in BABIP skills.

Since this seems to be a point of contention—how much variance in true BABIP skill there is to find—I’ll prove to you that R=0.5 or even R^2=.25 is insane for one season of data.

Take all pitchers with 150 IP or more in a single season from 2003-2011. They average 592 BIP. Their true BABIP skill is about .30, give or take, so the variance in luck HAS to be .21/592 for the average pitcher in this group. It’s impossible binomially for that not to be true. That’s a random variance of .000354. The actual variance in BABIP for that same group is .000457. That means randomness HAS to explain 77% (last time I got 75% but same diff)! I don’t know how much you think is team defense, but it’s not 0%. If you look at how much variance is explainable by defense seriously, it’s about 13%. That’s just regressing the data.
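For anyone who wants to check the arithmetic in that paragraph, here is a minimal Python sketch. It assumes a plain binomial model in which every ball in play is an independent trial with the same hit probability, and it uses only the figures quoted in the comment (.30 true BABIP, 592 balls in play, .000457 observed variance). The function and variable names are my own, not anything Swartz published.

```python
# Sanity check of the binomial arithmetic quoted in the comment above.
# Assumption: each ball in play is an independent trial with the same
# hit probability. Names here are hypothetical, chosen for illustration.

def binomial_luck_variance(p: float, n: float) -> float:
    """Variance of an observed rate over n independent trials with success probability p."""
    return p * (1.0 - p) / n

true_babip = 0.30             # quoted: "true BABIP skill is about .30"
avg_bip = 592                 # quoted: average balls in play for 150+ IP seasons
observed_variance = 0.000457  # quoted: actual variance in BABIP for that group

luck_variance = binomial_luck_variance(true_babip, avg_bip)
luck_share = luck_variance / observed_variance

print(f"variance from luck alone: {luck_variance:.6f}")
print(f"share of observed variance explained by luck: {luck_share:.1%}")
```

Under those assumptions the luck share comes out in the high 70s, in line with the 75% to 77% range quoted in the comments.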

So my original 12% number is the maximum explainable by differences between pitchers. That’s not what my regression found. That was 10.4%. Obviously give or take here or there, but you get the point. Most of it is explained by peripherals.

And just because you’re saying I’m looking at single-season numbers to prove that point, that has nothing to do with the implications of that 12%. The 12% means the standard deviation of pitcher skill is about .007 of BABIP. It can’t be much greater than that, and it has nothing to do with choosing a single season. The same analysis on careers or half seasons or whatever would give you about the same conclusion. I look at single-season because it’s the easiest to run these tests on quickly.
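The jump from "12% of the variance" to "about .007 of BABIP" is a square root away. A rough sketch, assuming the pitcher share applies directly to the .000457 observed variance quoted earlier in the thread; the 10% figure is my own remainder after taking the later 77% luck and 13% defense numbers at face value, and the names are hypothetical.

```python
import math

# Assumption: the pitcher-skill share of variance is taken out of the
# observed single-season BABIP variance quoted in the comments.
observed_variance = 0.000457

def skill_sd(total_variance: float, skill_share: float) -> float:
    """Standard deviation implied by a given share of the total variance."""
    return math.sqrt(total_variance * skill_share)

# 0.12 is the share quoted in the comment; 0.10 is what remains if luck
# is 77% and defense is 13% (my remainder, not a quoted figure).
for share in (0.12, 0.10):
    print(f"pitcher share {share:.0%}: skill SD of about {skill_sd(observed_variance, share):.4f}")
```

Either share lands near the .007 figure, which is why the choice between 75% and 77% luck barely moves the conclusion.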

So what exactly do you think are my wrong conclusions? Where in that description of variance will you determine that BABIP skill level has a higher spread than about .007, and whereas about .005 or .006 can be explained by a regression on peripherals, tell me what’s wrong here. If you want to say there is value in the last .001 or .002, great, keep at it. It may only be attainable with proprietary data, and good for you if you can use it to your advantage. But nothing that I have found here is wrong.

And there it ends, without even a snide remark from Fast on Twitter. I feel like Matt Swartz took Nate Silver’s Baseball Prospectus columns a little too much to heart.

Re-Explaining DIPS, Vol. 1

In this new series, I will highlight one of my least favorite SABR tropes, starting off an article by name-dropping Voros McCracken and explaining DIPS. First off, Matt Swartz in his recent, three-tabled article, “Adjusting defense efficiency by the quality of pitching”:

Fausto Carmona throws a hard sinker on the outside corner, but Ichiro Suzuki turns it into a well-struck ground ball by going the other way, splitting the defenders on the left side of the diamond. We know who should get credit for the single on the Mariners’ side of the box score—there was only one guy with a bat. But who on the Indians will take the blame for the single? Is it Carmona who made the pitch, or the defenders who could not get to the ball fast enough?

Bill James invented Defensive Efficiency, measuring the percentage of balls in play that a defense turns into outs. It became apparent just how useful this would be for evaluation of team defense when Voros McCracken famously concluded that, “There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.” A natural corollary to this thesis says that to measure team defense, one should use Defensive Efficiency rate.

Another Post About TUCK!

At the risk of making this the first and only blog dedicated to TUCK! sez criticism, how could I pass up his latest gem? It might be his worst finest achievement to date.

My complaints:

  1. TUCK! thinks Josh Hamilton is the Rangers’ first baseman.
  2. TUCK! thinks members of the BBWAA vote for the Gold Gloves (they don’t)
  3. TUCK! thinks Felix Hernandez was undeserving of the AL Cy Young and Joey Votto was undeserving of the NL MVP.

And I think the backwards K lives on Asteroid 331.

The Hardball Times Commenters Exist!

SABRs love arguing and they love the internet. So arguing on the internet comes naturally. You can find stupid arguments all over the comments of The Book, Baseball Prospectus, BBTF, and a bunch of crappy SB Nation blogs. But you couldn’t, until yesterday, on The Hardball Times. Before then, all you’d see was the occasional person calling out Tuck for being an idiot.

It took an astonishingly bad article, but look for yourself–a genuine SABR fight. Here’s the play-by-play.

  • 10:29 a.m. – Bob Lee starts off by noting how stupid Paul Francis Sullivan is (13 words)
  • 14 minutes later – Brad Johnson comments (144 words)
  • 2 minutes later – Mark comments (39 words)
  • 4 minutes later – Mark comments again (83 words)
  • 2 minutes later – ecp comments (13 words)
  • 8 minutes later – Brad Johnson comments (119 words)
  • 3 minutes later – Mark comments (83 words)
  • 5 minutes later – Brad Johnson accuses Mark of being schizophrenic (12 words)
  • 3 minutes later – Mark reveals there is another Mark (31 words)
  • 45 minutes later – the second Mark makes himself known and promotes his theory that there are two Pirates organizations; how ironic (82 words)
  • 34 minutes later – Paul E comments (52 words)
  • 6 minutes later – Brad Johnson comments (112 words)
  • 8 minutes later – Mark comments (142 words)
  • 8 minutes later – Brad Johnson comments (143 words)
  • 16 minutes later – Mark finally responds (170 words)
  • 8 minutes later – Brad Johnson comments (152 words)
  • 7 minutes later – Mark breaks out the caps lock, twice (109 words)
  • 9 minutes later – Brad Johnson is cowed by Mark’s caps lock (88 words)
  • 4 minutes later – Mark gloats (122 words)
  • 9 minutes later – Brad Johnson continues writing for some reason (129 words)
  • 5 minutes later – the second Mark chimes in; he’s very sneaky (149 words)
  • 2 minutes later – original Mark breaks out the caps lock again to seal the victory (140 words)
  • 14 minutes later – confused because Brad Johnson hasn’t commented in over 20 minutes, Mark uses more caps lock (77 words)
  • 19 minutes later – Brad Johnson returns to defend his honor (98 words)
  • 9 minutes later – Mark repeats himself (124 words)
  • 10 minutes later – Brad Johnson repeats himself (126 words)
  • 1 minute later – Mark openly mocks Brad Johnson (15 words)
  • 5 minutes later – Mark comes back for more, including caps lock (90 words)
  • 1 minute later – Paul Francis Sullivan wants something noted for the record, which I suppose would be this blog; so noted (11 words)
  • 3 minutes later – Brad Johnson accuses Mark of being contrary (129 words)
  • 19 minutes later – Mark continues to troll, compares Pedro Alvarez to his dog (51 words)

Since then, Brad Johnson hasn’t shown his face. He must be ashamed. Mark, for his part, has been proudly trumpeting his victory and taking on any newcomers, to the tune of 9 posts and 1,741 words in the last 24 hours.

Tuck, the World’s Laziest Cartoonist

I don’t have the patience to write an introduction for this, since any SABR worth his slide rule knows all about the horrible comic TUCK! sez that appears regularly on Hardball Times and also knows any moment spent thinking about it is a waste of time. Tuck’s artwork is terrible, his stories make no sense, and his jokes are so bad they actually might be anti-comedy.

The worst of it might be Tuck’s sheer laziness. Who else would post the exact same comic six times this year only changing the speech bubbles? If you don’t believe me, here’s the complete listing of what I call the “Dragon Ball Z Guy Reads the Paper While Watching TV and after the TV Says Something Stupid and Not Funny, He Shoots a Knowing, World-Weary Look to the Reader, also the Headline on the Paper Changes to an Even Worse Joke than the one the TV Made” series.

  1. November 14, 2007: Same as it ever was..!
  2. January 23, 2008: Going, going…still going…
  3. March 19, 2008: Whoulda thunk, indeed?
  4. June 11, 2008: Yeah, right…
  5. August 23, 2008: All part of the game, right, Lefebvre?
  6. August 30, 2008: Glavine-san???
  7. September 10, 2008: They have these new things called “maps”
  8. October 18, 2008: 2008 Playoffs Sketchbook, Part Five
  9. November 8, 2008: TUCK!’s 2008 Playoffs Sketchbook, part (last)
  10. November 19, 2008: File under “Duh”
  11. December 17, 2008: Media blitzed
  12. January 28, 2009: Cha-ching, 2K9
  13. March 5, 2009: Fascinating impersonating
  14. April 2, 2009: no foolin’!
  15. May 14, 2009: One for Calcaterra
  16. August 20, 2009: Play ball. Please!
  17. November 12, 2009: Winner gets the centaur paintings?
  18. January 11, 2010: How much is that per strikeout?
  19. March 25, 2010: Another one for Calcaterra
  20. April 22, 2010: That’s a lot of cabbage
  21. June 14, 2010: Strasburg Week, Pt 1
  22. August 19, 2010: There goes my BBWAA membership
  23. October 14, 2010: Next?

There’s almost no difference from one edition to the next. The biggest change is when Dragon Ball Z Guy’s cigar-smoking dog, Si-Si, first appears on May 14, 2009. Oh and the guy’s hat appears to say “FAN·O,” presumably because he’s a sports fan. Very clever. Anyway, that’s the only change in format between 2007 and now. This is the kind of hard-hitting research you’ll only see at Praiseball Bospectus.

Beyond the stupidity of the jokes or even the stupidity of the whole setup and payoff, is the stupidity of Tuck himself. I don’t think he ever bothered to save a template of the comic, even though he’s reused it nearly two dozen times now. He just scans in a copy of a copy of a copy each time. Comparing the older comics with the more recent, the quality and pixelization have definitely gotten worse over the years. Eventually, we’ll be reading a gray box with two speech bubbles. I don’t know if gray boxes can give withering looks, though.

In the end, this won’t change anyone’s opinion on TUCK! sez. It was total crap to begin with, so what difference does it make if he starts copying himself? Excuse me while I sulk for having spent an hour on this. Next time, though, I’ll give Tuck the Tangotiger treatment, as I investigate the “awards” he’s won.

Update: I don’t know why I didn’t do this before. Enjoy this slideshow of the comics. It makes it especially easy to tell that Tuck rescans the comic each time, as the orientation and size never stay the same.


Scouts, Statisticians, and Bad SABR

Finally, someone explains the whole stats vs. scouts issue. It all makes sense now!

Today in TUCK! sez

TUCK! sez: And somewhere, Robin Ventura shakes his head

This site is a poor imitation of one great blog already, so I’ll refrain from explaining today’s extremely stupid TUCK! sez comic on the Hardball Times, out of respect to Joe Mathlete. Nevertheless, I can’t comprehend why Tuck would suggest that Nolan Ryan or anyone involved with the Rangers auction at all wasn’t aware of the team’s debt obligations. Does Tuck really think he’s the first person to realize that the new owners will have to take care of that debt?

Tuck consistently produces the worst comics I’ve ever seen, but in terms of sheer stupidity of concept–not to mention execution–this one is in a class of its own.

I’ll indulge myself and touch on execution briefly. These are my issues: (1) Is it actually funnier to say “US Bankruptcy Court -n- Auction House” [sic] than “US Bankruptcy Court”? (2) Why is the American judge wearing a judge’s wig? (3) Does the empty speech bubble represent that Ryan is lost for words or did Tuck just forget to fill that one in? (4) How is the judge’s disembodied right arm coming from a mystery curl that isn’t attached to the rest of the wig? (5) Is Judge Nelms’s left hand really just a giant thumb? (6) And why is he so evil?