Tag Archives: comments war

Kyle Boddy in a Nutshell

This isn’t the grand return I would’ve hoped for–and I’m a little behind on reporting this–but over at The Book blog, Tango asks readers to leave a comment with their feelings about FanGraphs (mine are already known). Kyle Boddy dips his toes into the water, saying:

This is a great example of an article that is very misleading:

http://www.fangraphs.com/blogs/index.php/vegas-still-isnt-buying-the-orioles/

The author has no experience in gambling markets, and the commentators are even worse. 99.9% of what is written here is pure drivel and/or speculation.

The internet in a nutshell responds:

The guy who has spent years putting his name on discredited and ridiculous articles about pitching mechanics making summary judgments without any supporting evidence of someone else’s work. When asked for supporting evidence on Twitter, you say you don’t have time to offer up any kind of reasoning for your claims, so you link to a blog post that offers no reasoning for your claims.

You are exactly the kind of critic that Tango is rightfully railing against.

Anyone who has read Praiseball Bospectus before should know how I feel about that. Care to respond, Kyle?

Talk about irony, anonymous user.

I’ll allow myself this satisfaction: Will I be seeing you at the multiple conventions I’m being paid to speak at this year about training pitchers? Or will you be sitting in on the discussions I’ve had with front office executives?

I don’t think he understands irony, but whatever. Someone show me a front office talking to Kyle Boddy about mechanics and I’ll show you a, uh… front office that is doing a bad job? I’m a little rusty with my zingers, I’m no “internet in a nutshell”.

If all it takes to become a mechanics expert that people pay attention to is to write confusing articles on the Hardball Times, I’ve clearly taken the wrong path in life. I’ll expand on this topic in my next post, “How Dylan Bundy’s Kinetic Load Increases Torque and Humeral Rotation”.

You’re Wrong; No, You Are!

As I’ve mentioned before, the comments section of the Hardball Times is a barren wasteland. But Matt Swartz’s latest treatise on being an idiot with a stats software package attracted some controversy, mostly because he’s an idiot with a stats software package. I’ve archived the SABR fight in case the comments disappear as things sometimes do on THT. And I’ve even highlighted the best parts.

Mike Fast:

The community has shown with certainty that there is little difference between pitchers? I would say that my study of HITf/x data indicated exactly the opposite.

And similarly for team defensive efficiency, a large portion of it is due to how hard the team’s pitchers allow the ball to be hit.

Single-year BABIP is a crude measure of pitcher skill, and it’s leading you to conclusions about the game of baseball that are very wrong.

Matt Swartz:

I’m not coming to any wrong conclusions. I don’t know what you think I’m doing with single season BABIP, but it’s not leading me to wrong conclusions.

There IS little difference relative to the difference between pitchers in strikeout rate, which is why it takes more than a season to stabilize.

What your study showed was that how hard balls are hit is persistent, and that it is correlated with BABIP. It didn’t widen the spread of pitcher BABIP skill levels in the MLB, which is and always has been minimal compared to the spread in strikeout rates.

I find your comment about “leading you to conclusions about the game of baseball that are very wrong” to be fantastically indicative that you haven’t really read and understood this or anything else I’ve written on the topic of pitcher BABIP. If you did, you could certainly understand your own findings better, and you’d know they aren’t contradictory.

The reason that single season BABIP is a crude indicator of pitcher skill is sample size. The variance in an individual’s BABIP due to randomness is going to be about [.21/(number of batted balls)]. Knowing this, we can actually pin down that about 75% of single season BABIP variance is due to luck for pitchers with >=150 IP. The rest of it comes down to knowing the other 25%. We know that regressing team BABIP by the same process would yield another 13% of the variance in BABIP, which means that there is 12% for pitching.

Using single season BABIP to understand that 12% will do a pretty poor job. However, using peripherals and running a regression as I have will eliminate a lot of that noise. In fact, you can explain about 10.4% of that 12% by knowing peripherals. What your study likely did is duplicate some of the effort in understanding the first 10.4% (hard hit balls are correlated with peripherals; check your data, I’m sure it’s true) and supplement a good portion of the remaining 1.6%.

In other words, nothing you found negates anything I’ve found at all. You’ve come up with a way to use proprietary data effectively. Unless you have that available, using peripherals does a pretty good job. I can’t even imagine what it is that you disagree with here, or what you think I don’t understand.

MF:

I’m not disputing your statistics. I’m disputing your conclusions about the game of baseball.

“What your study showed was that how hard balls are hit is persistent, and that it is correlated with BABIP. It didn’t widen the spread of pitcher BABIP skill levels in the MLB, which is and always has been minimal compared to the spread in strikeout rates.”

Right. But I did show that BABIP is a poor way to measure pitcher skill. We sorta knew that already, but some people had taken the BABIP findings to mean that pitcher skill was also minimal. I established that that conclusion from the evidence was wrong.

You are correct that strikeout rate picks up some of the hard-hit ball skill that pitchers have. However, it does not pick up nearly all of it.

Moreover, batted ball categories are pretty good at picking up vertical launch angle effects, but they are lousy at picking up how hard the ball is hit.

So your regressions are still missing some pretty important data.

Yes, the ways we have found to measure that data so far are proprietary. That doesn’t mean that we shouldn’t learn about the reality of baseball from that data and let that affect how we frame questions, though. I would certainly wonder why BABIP doesn’t better reflect how hard the ball is hit.

I found that almost half of team BABIP was due to how hard the ball was hit. So when you say it’s 12 percent pitching skill, that’s what I’m disputing. You could say that you can only detect that 12 percent of the team BABIP is due to the pitchers, but it’s a leap of logic to say that you’re looking at pitching SKILL there. And HITf/x data indicates in fact that you are not.

Also, I don’t understand why you insist on looking at single-season pitcher/team BABIP to determine that number. It is simpler to calculate, but it’s deceptive. Being rooted to single-season numbers is one of the big failings of modern sabermetrics.

MS:

Which of my conclusions about the game of baseball do you dispute?

You found that how hard a ball is hit is highly correlated. This is a self-contained statistic that is only useful inasmuch as it can teach you about singles, doubles, triples, home runs, outs, and errors. It doesn’t do me any good to know the statistic otherwise, except for how it relates to outcomes that affect games. So BABIP is a logical skill to try to infer from how hard a ball is hit, and your numbers do a nice job of hitting on that.

I think when you say “half of BABIP was due to how hard the ball was hit,” you’re either using same year data or R instead of R^2 or doing both. I’m guessing you’re doing correlations, while I’m doing R^2.

But if it’s just same year data, you’re including luck in terms of how hard a ball was hit (of course pitchers will deviate around their true talent rate in this category as well). That doesn’t measure skill. That measures outcomes.

My regressions are not intended to be the end-all summary of a pitcher’s true BABIP skill. They pick up about 80% of the possible variance that could exist in BABIP skills.

Since this seems to be a point of contention—how much variance in true BABIP skill there is to find—I’ll prove to you that R=0.5 or even R^2=.25 is insane for one season of data.

Take all pitchers with 150 IP or more in a single season from 2003-2011. They average 592 BIP. Their true BABIP skill is about .30, give or take, so the variance in luck HAS to be .21/592 for the average pitcher in this group. It’s impossible binomially for that not to be true. That’s a random variance of .000354. The actual variance in BABIP for that same group is .000457. That means randomness HAS to explain 77% (last time I got 75% but same diff)! I don’t know how much you think is team defense, but it’s not 0%. If you look at how much variance is explainable by defense seriously, it’s about 13%. That’s just regressing the data.
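For anyone who wants to check the arithmetic in that paragraph, here is a minimal sketch in Python. It assumes the binomial model Swartz describes (every ball in play is an independent trial with a .300 chance of falling in) and uses only the figures he quotes; the function and variable names are made up for illustration.

```python
import math


def binomial_luck_variance(true_babip: float, balls_in_play: int) -> float:
    """Variance of observed BABIP around true talent, treating each
    ball in play as an independent trial (an assumption of the model)."""
    return true_babip * (1.0 - true_babip) / balls_in_play


# Figures quoted in the comment above.
TRUE_BABIP = 0.30             # assumed true-talent BABIP
AVG_BIP = 592                 # average balls in play, 150+ IP seasons
OBSERVED_VARIANCE = 0.000457  # observed single-season BABIP variance

luck_variance = binomial_luck_variance(TRUE_BABIP, AVG_BIP)
luck_sd = math.sqrt(luck_variance)              # one-season standard error
luck_share = luck_variance / OBSERVED_VARIANCE  # fraction attributed to luck

print(f"luck variance: {luck_variance:.6f}")
print(f"luck standard deviation: {luck_sd:.4f}")
print(f"luck share of observed variance: {luck_share:.1%}")
```

Run as written, the luck share comes out a little over 77%, which agrees with the figure in the comment.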

So my original 12% number is the maximum explainable by differences between pitchers. That’s not what my regression found. That was 10.4%. Obviously give or take here or there, but you get the point. Most of it is explained by peripherals.

And just because you’re saying I’m looking at single-season numbers to prove that point, that has nothing to do with the implications of that 12%. The 12% means the standard deviation of pitcher skill is about .007 of BABIP. It can’t be much greater than that, and it has nothing to do with choosing a single season. The same analysis on careers or half seasons or whatever would give you about the same conclusion. I look at single-season because it’s the easiest to run these tests on quickly.
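The jump from "12% of the variance" to "about .007 of BABIP" can be sketched the same way. This assumes the observed single-season variance of .000457 quoted earlier in the thread and simply takes the square root of each share of it; `skill_sd` is a hypothetical helper, not anything from the original comments.

```python
import math

OBSERVED_VARIANCE = 0.000457  # single-season BABIP variance quoted in the thread


def skill_sd(variance_share: float,
             total_variance: float = OBSERVED_VARIANCE) -> float:
    """Standard deviation implied by a given share of the total variance."""
    return math.sqrt(variance_share * total_variance)


pitcher_sd = skill_sd(0.12)       # the 12% attributed to pitchers
peripherals_sd = skill_sd(0.104)  # the 10.4% the regression reportedly explains

print(f"pitcher skill SD: {pitcher_sd:.4f}")
print(f"SD explained by peripherals: {peripherals_sd:.4f}")
```

Each share is converted separately because variances add and standard deviations do not.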

So what exactly do you think are my wrong conclusions? Where in that description of variance will you determine that BABIP skill level has a higher spread than about .007? And given that about .005 or .006 can be explained by a regression on peripherals, tell me what’s wrong here. If you want to say there is value in the last .001 or .002, great, keep at it. It may only be attainable with proprietary data, and good for you if you can use it to your advantage. But nothing that I have found here is wrong.

And there it ends, without even a snide remark from Fast on Twitter. I feel like Matt Swartz took Nate Silver’s Baseball Prospectus columns a little too much to heart.

The Best of MGL, October 2011

Apparently, Major League Baseball had their postseason last month. I was travelling in Treasure Island, Ontario for work and without internet access, so I missed it. A lot of SABR drama flared up while I was gone–too much for me to properly deal with here, unfortunately. The best I can do is publish this compilation of MGL’s best comments from October, 2011, but without any of the snappy backtalk you’ve come to expect from Praiseball Bospectus.

They say that legends are born in October, but only one man can reign over it. MGL easily surpassed 40,000 words (or enough for a short novel) written in comment threads last month. And I’m sure there are even more great MGLian comments not included here that I missed (especially if not posted on The Book Blog). I’d like to add them if you’d be kind enough to post a link in the comments. So, please help me out.

Without further ado, I present the SABR who never sleeps, MGL.

Terrible managing?, October 5, 12:31 AM:

My thoughts are this:

If I could give a manager a piece of paper with the answer to all of these decisions, I would be correct 90%+ of the time and a lot more than the manager would. A lot. I would miss some of the intangibles for sure, but those would pale in comparison to the “numbers” behind my decisions. I would add at least one win to a team’s WE, thus I should be paid 5 mil or more…

Worst managing ever?, October 8, 12:17 AM:

#5 and #6, and because you think it, that makes it right? You want to accept that bet also?

I’ll bet you won’t accept that bet. That is because when people who have little expertise on a matter have an opinion on that matter and those opinions are not supported by evidence, they never take those bets. I wonder why?

You see, anything that I posture on this blog, I will always stand by it, because it is almost always based on evidence or my experience, knowledge, or expertise which has been gleaned by evidence. I learned a long time ago that my opinions without a solid base aren’t worth jack…

Worst managing ever?, October 8, 8:00 PM:

Wow, nice folks on BTF. I wonder why I left that site years ago.

What is incredible is how many people there use the result of the plays, the game, and the series to refute my claims.

I wasn’t aware that someone took me up on my bet and then they got to choose their methodology, players, etc.

Before someone takes me up on one of my bets, please do the following:

  1. Say, “I accept your wager.”
  2. Identify yourself and let us know how you intend to pay up if you lose.
  3. Figure out a way to choose a third party to verify or conduct any research that is needed.

If I lose I will gladly pay up and learn something in the process. Someone (who has no idea who I am) on BTF actually said something to the effect, “…because the dickhead never pays up…”

I am doing some research now on the 9th inning thing…

How do great starting pitchers pitch the 4th time through the order?, October 9, 1:24 AM:

So, in the original thread, did I say anything egregious or is the entire post egregious enough to warrant the vitriol on BBTF and even here?

I made the point about Punto, which I think is correct, at least according to my sim, and a 1.5% WE is pretty big. And I have not heard anyone refute that with evidence other than stupid batter/pitcher matchups which we have already discussed (and hopefully put to rest), or the proverbial, “The manager must know something that you don’t know.”

I made the point about the two bunts, which I think are correct, and, again, I have not heard any refutation with evidence. A few people excoriated me for saying that Carp would have been safe anyway, which is debatable (or not) but completely irrelevant to my argument. I simply said that with him running, the bunt win expectancy is going to be very poor compared to the WE from batting. That is because a bunt is always marginal. Throw in a poor/slow runner on the bases such that he is going to get forced a significant percentage of the time, and the bunt is not likely to be correct, even against a very good pitcher. Any arguments there?

I said that bunting with a 2-0 count, when the bunt at the outset of the PA was probably bad, is an egregious error, and I am pretty confident it is, and I have not heard any refutations on that either. Problems with that?

I said that not pinch hitting for Carp in the 8th was bad, but I did not harp on that. I still think the numbers will show that was bad. I admit that probably no manager would have done that – although that does not make it correct. I am strictly speaking of mathematically correct things, and not what would make the manager look good or bad. That is not my job to determine that.

And finally (the 4th or 5th thing I criticized), I said that bringing in Motte was correct. It looks like that may be a tossup, but with Carp not being a top tier starter (according to my projections and others – see ZIPS, Oliver, Pecota, Steamer, etc.) and with Motte being a very good closer, I think that bringing in Motte IS the correct choice, but perhaps marginally so. Again, whether a manager “should” do that is not my business. I am talking strictly numbers.

So why the universal hate, condemnation, criticism, mockery, etc.?

Someone please explain what I did to deserve that? And I am talking about substance and not tone. If someone wants to criticize me for my tone, so be it. I don’t give a hoot about that. Those are all ad hominem arguments anyway for people who sadly have nothing substantive to contribute…

How do great starting pitchers pitch the 4th time through the order?, October 9, 11:09 AM:

It should be fairly easy to figure out what is happening in the 9th, but I can’t do that now.

McCoy, you can do all the speculation and “thinking” you want, and you might be right about conclusions, but without any numbers, they are meaningless. Sorry. Either one strategy or the other yields a greater win expectancy, or it is close in which case I have no problem yielding to the gut, experience, instinct, etc. But, unfortunately, you can’t figure out the answer without “running the numbers.”

How not to do a study, October 10, 1:24 AM:

Tango, why do you bother? You are infinitely more patient and kind than I am!

Worst managing ever?, October 10, 6:53 PM:

What I find especially outrageous, almost scandalous, is that someone could actually write, as in #72, that I said, “TLR is a terrible manager because he let Carpenter pitch the 9th,” when the title of the post was hyperbolic, as titles or headlines often are, and the last of several examples I gave of Tony’s mistakes was leaving Carp in to pitch the 9th. I suppose a loose characterization of my post is, “TLR is a terrible manager because (insert anything I happened to mention in the post),” but in my opinion that is classic spin, mischaracterization, taking words out of context, etc. in order to launch an ad hominem attack and deflect attention from the issues at hand and is otherwise totally uncalled for.

Worst managing ever?, October 11, 9:37 AM:

Yes, I may have been wrong about the difference between an elite starter in the 9th who has been pitching extremely well and a closer. Can we put that to rest and move on?

Regardless of how assured I make myself out to be on various issues, I am sometimes wrong, thankfully. Let’s all have a party, hand out awards to everyone who doubted me, and move on…

Starting Pitcher on a great day v Closer, October 11, 8:07 PM:

Circle, your “English” explanation is useless. Do you really think that a manager’s “experience, intuition, and expertise” can figure out the right answer? If you do, then you are dreaming. I mean managers make ridiculous, silly mistakes all the time, believing in erroneous things like the hot hand, and batter/pitcher matchups, and other small sample nonsense. Do you think that they magically become genius savants when it comes time to figure out when to take out their starter and when to bring in a reliever?

I certainly agree that if it is a tossup, you probably want your starter in order to save your reliever for a possible extra inning game or for tomorrow.

But, if your starting pitcher comes to bat late in a close game, and it is not an obvious bunt situation (with no outs), and especially with runners on base or leading off an inning, since the pitching aspect alone is a tossup (presumably), then it is a no-brainer as far as pinch hitting is concerned since there is always a large difference in WE and RE between a pitcher hitting and a pinch hitter hitting in a high leverage situation.

All the nonsense about, “The manager believes that the starter can shut them down, and the other team is not making good contact, and the team is ahead in the game anyway,” is just that – nonsense. All those “English” explanations will not help to facilitate the right answers in any way, shape or form.

Anyway, I have not read Max’s article yet, but I agree that there needs to be a lot more investigation and controlling of all the variables before we declare it a tossup…

Starters and Relievers in the 9th Inning and Score Differential, October 17, 3:48 AM:

I hope that someone links to this study in BBTF.

As I said, I love to be proven wrong, since that gives me an opportunity to learn something. However, after all the bashing I incurred at BBTF, I think it only fair for them to see this new research. Not that I wouldn’t get bashed again. After all, it is BBTF.

As well, I think I learned a lot from this research, even though in the end, I think I was vindicated, which is kind of a silly notion anyway. As I have always said, and Guy put it aptly, my opinions are almost always informed. Sometimes they are specifically supported by evidence and sometimes they are not. They are never, however, out of my a**, as most “lay” opinions on sports are. After all, I am an expert in the field of sports analysis. You would think I was a lay journalist opining on sabermetrics, like Jayson Stark, Buster Olney, or Murray Chass, if you read the comments on BBTF.

I also encourage other people to do similar research. For example, why is it that wOBA is so much higher when the game is close? I’m sure we can speculate, but without looking at the components and perhaps even the pitch f/x data, I don’t think we can be too sure of anything in that regard.

I would also like to see how pitching with runners on base comes into play. The starters obviously always started the 9th, but the reliever data I looked at was anytime during the inning. It could be when they started the 9th or when they came in in the middle of the inning, often with runners on base.

As well, although I didn’t adjust for platoon issues, the relievers definitely faced more same-handed batters, especially when the pitching team was losing, suggesting that they were brought in specifically to face same-handed batters at some point in the inning. This needs to be looked at too.

So I don’t think that the story is over, although I think we found a very significant factor that was affecting the data in the prior research…

Starters and Relievers in the 9th Inning and Score Differential, October 17, 11:56 AM:

DavidS, because of the small samples in each category for the starters, it is almost inevitable that there will NOT be a smooth transition and pattern, which is one reason why I broke it down into only 2 categories at the end.

#5, is this the Don Malcomb of Big Bad Baseball? My, you’ve come a long way down. Do you enjoy attacking me for little reason? Is it professional jealousy?

As always, if you don’t have anything substantive, intelligent, or otherwise valuable to say in my house, I’d prefer you stay out…

Starters and Relievers in the 9th Inning and Score Differential, October 17, 1:00 PM:

5 out of 9 continuing to mock me while adding nothing to the discussion.

And one with this:

“I don’t need to see any research to show me that good starting pitchers who have pitched strongly and efficiently through the first eight innings should be kept in to try and complete the game. I suspect that almost every serious baseball fan in America knows this instinctively.”

What a pathetic excuse for a serious blog…

Manager mistakes in the 2011 WS: Game 1, October 19, 8:23 PM:

And if anyone makes any reference to the outcome or result of a certain strategy in terms of evaluating or even mentioning it (as a mistake), you are going to be banned for life!

You have no idea how many people in BBTF said something like, “You are an idiot for suggesting that Carpenter should not have batted in the 8th or pitched in the 9th,” because he got a hit and retired the side in the 9th. Of course calling me an idiot on BBTF is nothing new…

Manager mistakes in the 2011 WS: Game 1, October 19, 9:27 PM:

I was curious about Jay batting second. Andrus batted second all season long. Surely not a good choice but at least it was not an example of a manager “doing something different” for a bad reason, which happens all the time, I guess to show that they actually have a difficult job (in terms of lineups and in-game managing), which they don’t, in my opinion. I think I can train a 12 year old to manage a baseball game. Oh, wait, I forgot about the “double switch” in the NL. Too tricky for a 12 year old…

Manager mistakes in the 2011 WS: Game 1, October 19, 9:45 PM:

Some of you guys are going to be surprised at actually how many mistakes a manager can make in a game or series, especially in the post-season when managers think they have to do “something.” I have been mentally noting these mistakes for 25 years. Maybe that is why I am so ornery…

Manager mistakes in the 2011 WS: Game 1, October 20, 12:12 AM:

“Carpenter wasn’t pitching that great to fear him for the 7th.”

I guess you have not read all the work I did in the other thread! You still think that you can tell whether a pitcher is “pitching well” enough to continue. Managers can’t do that. You should be a manager!

Right, hitters occasionally do that to fool the umpire, but in this case there was no way that was acting.

“Pujols has been IBB’d 4 times in the playoffs. All 4 times the inning has ended with Matt Holliday and no runs scored.

Regardless of the good/bad decision aspect to it, it will likely continue until Holliday makes them pay.”

That is true. That is one reason that managers make so many bad decisions…

A player after my own heart!, October 20, 8:53 PM:

Circle, you can use things like that as tie-breakers. Other than that, I’m tired of responding, with all due respect.

Repeat after me:

“Opinions without evidence…

Opinions without evidence….

Opinions without evidence….”

Please spare us.

I don’t care how much knowledge and experience you have.

Here is an MGL’ism along the lines of the Zen-master:

“Experience is often the enemy of the truth…”

A player after my own heart!, October 20, 9:00 PM:

Circle, you’re everything that a manager is, which is not necessarily a bad thing. SOME of your insight, knowledge, and experience is valuable on this blog.

However, a big part of sabermetrics, at least a vestige of it, is to show all the things that managers (and most people in general) believe that are simply not true. Rather than keep digging your heels in, learn something. You must be on this site for a reason other than to keep telling us that conventional wisdom is right and we are wrong. Or, more along the lines of your tone, “Yeah, you guys might be right, but…”

Your “buts” HAVE TO HAVE EVIDENCE for anyone to take them seriously. If you don’t have the expertise to do the research, then cite research from someone who does. If not, and I mean this teasingly, “Shuddupp!”

You say, “If it were me, I would pitch him at home. The splits are so large they must mean something.”

Evidence? No!

You’re gonna respond positively, as you always do, and I appreciate that, but save it. Then 10 minutes later, you’re going to say, “Yeah, but…”

How about, “I think the world is flat and that we, as human beings, were spawned by aliens who landed here a long time ago. And we never landed on the moon and 9/11 was a conspiracy by the U.S. government?”

These are opinions same as yours. Evidence? Nah. Don’t pay them any mind and I won’t pay yours. Deal?

Manager moves in the 2011 WS: Game 2, October 20, 10:14 PM:

Tango, #6, I never thought it was that difficult to plot a pitch by eye, despite what some of the pitch f/x guys say. Especially if you watch a lot of games and you mentally adjust for the slightly offset camera angle (some broadcasts are more centered than others)…

Manager moves in the 2011 WS: Game 2, October 20, 10:34 PM:

I thought Moreland was traded to the Cardinals before the game started?

I didn’t really watch Pujols on that fly ball, but of course he should be running it out in the freaking WS! He is one of those guys that almost never runs out balls that are likely to be outs. I don’t care how good he is or how much money he makes. If I am the manager, he either runs them out or doesn’t play. If nothing else, it is a poor model for the younger guys.

So why was Napoli batting so low early in the season? He was an excellent hitter going into the season…

Manager moves in the 2011 WS: Game 2, October 20, 10:48 PM:

It is a freaking hit and a base advance on the throw. If there is an error it is on the RF’er for a bad throw

Manager moves in the 2011 WS: Game 2, October 20, 11:02 PM:

Westbrook is a terrible starter, but as a reliever, maybe he is 1 run better. Only TLR knows that. As you said, Phil, you gotta choose the guy who has the most K, you don’t mind walks. You want a guy who misses bats.

You bring in a high K/BB guy, and if Young walks, you bring in Westbrook, the sinker-baller.

Ah, I should have been a manager or pitching coach!

Remember I said, There would probably be a mistake before the game ends. There were maybe 5 mistakes.

Circle, that throw was terrible. Probably 15 feet off line. I don’t even think that Pujols touched the ball…

Manager moves in the 2011 WS: Game 2, October 20, 11:18 PM:

The media is so stupid and predictable…

Manager mistakes in the 2011 WS: Game 1, October 20, 11:22 PM:

Not thin air. Nothing I say is out of thin air. You can disagree of course. But everything I say is based on 25 years of sabermetric research, mine and others’.

I have Feldman projected as a 4.32 starter (average starter is around 4.08), which is actually not that bad. I take back what I said about him being “terrible”, although I do think he is worse than that as a starter based on what I have seen of him (but I don’t intend anyone to take that seriously) and ZIPS, Oliver, Steamer, and Pecota have him as around a 4.62 which is very poor. So the consensus is probably that he is a mediocre starter at best.

As a reliever, we usually just subtract around 1 run per 9, although I subtract .82 (from the research I have done). And my projection is based on his starts and relief appearances, with each one adjusted.
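The starter-to-reliever adjustment described there is a flat subtraction, so it can be sketched in a few lines of Python. The 4.32 projection and the two discounts (1 run and .82 runs per 9) are the figures from the comment; the function name is hypothetical, and a real projection system would presumably do more than subtract a constant.

```python
def reliever_projection(starter_runs_per_9: float,
                        relief_discount: float = 1.0) -> float:
    """Shift a starter projection to a relief role by a flat discount."""
    return starter_runs_per_9 - relief_discount


FELDMAN_AS_STARTER = 4.32  # projection quoted in the comment

rule_of_thumb = reliever_projection(FELDMAN_AS_STARTER)          # subtract 1 run
mgl_adjustment = reliever_projection(FELDMAN_AS_STARTER, 0.82)   # subtract .82

print(f"rule of thumb: {rule_of_thumb:.2f}")
print(f"with the .82 discount: {mgl_adjustment:.2f}")
```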

So that is my “thin air…”

Manager moves in the 2011 WS: Game 2, October 21, 12:27 AM:

Thanks! With that attitude, you would never be allowed on BTTF (Bash The “The Book” Fan).

BTW, swapping Napoli with Young is more advantageous than swapping Andrus with Beltre. The former generates an extra 7 runs per season, while the latter only adds .5 run per season. So maybe having Andrus at #2 is not all that bad.

If we just switch Andrus with Napoli though (G-d forbid we bat a slow power hitter in the 2 hole), and kill 2 birds with one small stone, we get 12 extra runs a season or 1.6 wins a year…

Manager moves in the 2011 WS: Game 2, October 21, 12:06 PM:

I’m confused. We (saberists) get criticized all the time for not taking into consideration (including in our models/formulas) things like, “Hammy is hurt and may not be able to turn or catch up to Motte’s fastball.”

Yet, LaRussa, arguably one of the best at seeing and utilizing things like that, takes Motte out.

Ah, yesterday, he was a genius and hero, today, he is a goat…

Taking Out Jason Motte, October 21, 8:06 PM:

Wow. Without actually applying some NUMBERS to each of those options, it is impossible to know which one is correct (yields the highest WP for the Cards). Each of us can have an “opinion” on which one is correct, but without NUMBERS, I am afraid that opinion isn’t worth much.

I’ve been doing these kinds of analyses for 25 years and I have no idea which one is correct. I suspect that D might be, but you/I would have to figure out how much the bad “D” in the field costs in WP as well as removing another player (decent chance for a tie game and extra innings). It is really complicated to figure all this out, but it can be done (approximated at least, so we have SOME idea as to which option might have been correct).

I am completely agnostic as far as walking the bases loaded. Normally you never do that with 0 outs (other than perhaps in the bottom of the 9th in a tie game), but here, I don’t know. I don’t think (no, I KNOW) that Dave or anyone else knows without “running the numbers.”

I’m also not sure why Dave is obsessed with the Cards trying to make Hamilton hit the ball on the ground. I have no idea whether that would be better or worse than a fly ball. Lots of fly balls don’t score the runner at third (short ones of course) and lots and lots of fly balls don’t move the runner to third. Same thing with ground balls. Some are base hits, some move both runners, etc. I don’t remember if the IF was playing in or not (probably not), but even if it were, I’m STILL not sure whether a fly ball or ground ball would be better. IOW, I’m not sure whether I want a GB or FB pitcher to pitch, everything else being equal.

Also, I am very uncomfortable when an analyst gets to choose which sample he wants to present to support his point or his opinion. This year only? Last 2 years? 3 years? Career? Lately, as in last half season? You should not be allowed to do that, for obvious reasons (cherry picking your evidence makes your arguments intellectually dishonest, or misleading at best).

For example, Dave said this:

“While Hamilton’s strikeout rate against LHPs jumps to 22.1%, Rhodes K% against LHBs this year was just 16.1%. His career numbers are much better, but he’s not the same pitcher he was a few years ago, and Hamilton had hit an outfield fly against him the night before.”

Yes, he is not the same pitcher, but if this year his K% was higher than his career numbers, Dave would obviously be quoting his career numbers (heck, I would too if I had the choice!). The analyst should NOT have the choice. He should always be quoting a projection which is some kind of weighted career average!

And for the last part of that last sentence, about Hammy hitting a fly ball the night before, David should get immediately thrown into the MGL jail. I can’t believe he even said that in that context. Shame on you Dave!

“…but then you’re essentially inviting Ron Washington to execute a squeeze play. “

I think there is about a zero chance of Hamilton squeezing just because Motte is playing third. That should not even be in the analysis unless you want to use it as a tie breaker in a dead heat.

Finally, David didn’t even mention one of the most important parts of the puzzle, and one that made LaRussa’s decision likely awful. Lynn is a terrible pitcher! I don’t care how he has done lately (has it been good?). I and other projection experts have him as near replacement level! If you know that he is going to bring in Lynn, as opposed to, say, Dotel, then it is a no-brainer not taking out Motte (or putting him in the field).

So while David definitely brought up most of the relevant facts in order to determine which option was best, I don’t think that any of us is any closer to the answer….
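To make MGL’s “weighted career average” concrete, here is a minimal Python sketch. It is not MGL’s actual projection system: the 5/4/3 season weights, the league-average ballast, the league rate, and the reliever’s stat line are all made-up assumptions for illustration.

```python
def project_rate(seasons, league_rate, weights=(5, 4, 3), ballast=200):
    """Project a rate stat from recent seasons, newest season first.

    seasons: list of (successes, opportunities) tuples, newest first.
    weights: per-season weights (assumed values, not a published system).
    ballast: league-average opportunities added in weighted units
             (assumed value) so small samples get pulled toward the mean.
    """
    num = 0.0
    den = 0.0
    for (successes, opportunities), w in zip(seasons, weights):
        num += w * successes
        den += w * opportunities
    num += ballast * league_rate
    den += ballast
    return num / den


# Hypothetical lefty reliever vs. LHB: (strikeouts, batters faced), newest first.
seasons = [(10, 62), (30, 110), (35, 120)]
league_k_rate = 0.18  # assumed league rate, for illustration only

this_year_only = seasons[0][0] / seasons[0][1]
career = sum(k for k, _ in seasons) / sum(bf for _, bf in seasons)
projection = project_rate(seasons, league_k_rate)

print(f"this year {this_year_only:.3f}, career {career:.3f}, projection {projection:.3f}")
```

In this made-up example the projection lands between the single-season and career figures, and the analyst never gets to pick which slice to quote.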

Managing the 2011 World Series: Game 3, October 22, 9:49 PM:

McCarver:

“Trailing by 2 instead of 1, he has to hold him (Napoli) at third.”

Seriously, does this guy have any analytical skills?

In the 4th inning, it makes almost NO difference whether they are down 1 or 2 in terms of sending the runner.

And that was not such a great throw from Holliday, if only because he did not back up on the catch and then come in. With Napoli running, that should not have been a close play!

2011 WS discussion: Game 4, October 23, 10:19 PM:

Have you ever known anyone in “real” life who does a lot of unconventional things, even when some of them are incorrect, just because they think they are smarter than everyone else and in order to prove that, they have to do things differently? I do.

That is LaRussa in a nutshell. I’ve said this for many years. When I worked for the Cards and I met him for the first time, he basically laughed me out of the room (he has no use for sabermetrics or sabermetricians – none at all)…

Small sample sizitis, October 24, 3:03 PM:

People still don’t get it. If you have little or no predictive value in the population, as we found in the book, then, no matter what you THINK you know, and no matter what SEEMS to make sense, the sample size barely matters.

We can talk about this until we are blue in the face, and we can explain it in detail in The Book, but we still get this:

Oh yeah, I understand, but:

“The match-up numbers became large enough to tell something.”

Tango, I don’t like that you undersell the lack of predictive value we found with batter/pitcher matchups. I mean, you even went so far as to try and increase the size of the samples by using “pitcher families” (OK, not quite the same thing as facing one particular pitcher), and STILL we found nothing.

Can we please put this to rest? No number of PA gives us any more than de minimis value (at most – it still might be NOTHING – given that we found nothing). Use it as a tie breaker if you want – I don’t care. But please don’t use it otherwise as a decision maker no matter how many PA it is based upon (obviously you can’t have hundreds of PA and even then, there is ZERO evidence that that would have predictive value).

Guys: If 50 or 100 PA have any practical predictive value, then we would find it in 20 or 30 PA. We didn’t. We found nothing. Nothing. N-O-T-H-I-N-G.

So, please, pretty please, stop telling us about Earl Weaver and your grandmother who are so smart that they wait for 20 or 30 PA. We looked at lots of guys who had 20 or 30 PA and found zilch. Nada.

We tell you (in The Book) how much to regress clutch (a lot) given a certain sample size. We tell you how much to regress pitcher BABIP. We tell you how much to regress windup/stretch splits. We tell you how much to regress RHB platoon splits. All of these are a lot even with a fairly large sample size.

We didn’t tell you how much to regress batter/pitcher matchups. Do you know why? Because we found NO predictive value, i.e., no “skill”, i.e., all the various unusual results you see are likely due to random fluctuation, at ANY sample size.

And again, for the 10 thousandth time, you cannot use the argument, “You found nothing at 30 PA, but at 50 PA, there IS something.” If you find nothing at 30 PA, as long as your sample of players is reasonably large, then there is nothing or almost nothing at 50 PA or 100 PA.

Do we know all of this for a fact – i.e. with 100% certainty? No! We know virtually nothing for a fact when they are based on inferences derived from sample data, which almost all sabermetric tenets are…
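MGL’s 30-versus-100 PA argument can be sketched with the usual shrinkage form, where the weight given to a sample of n PA is n / (n + k). That form, and the 2% weight at 30 PA used below, are assumptions for illustration only; the finding as quoted here is simply that the signal was indistinguishable from nothing.

```python
def implied_k(n, weight):
    """Solve weight = n / (n + k) for k, the regression constant in PA."""
    return n * (1.0 - weight) / weight


def weight_on_sample(n, k):
    """Share of the estimate that comes from the observed sample of n PA."""
    return n / (n + k)


# Assumption: suppose the matchup sample deserved, at most, a 2% weight at 30 PA.
assumed_weight_at_30 = 0.02
k = implied_k(30, assumed_weight_at_30)

for n in (30, 50, 100):
    print(f"{n:>3} PA -> weight on matchup sample: {weight_on_sample(n, k):.3f}")
```

Under those assumptions, going from 30 to 100 PA moves the weight on the matchup sample from 2% to a bit over 6%, which is still tie-breaker territory. A signal too small to detect at 30 PA cannot turn into a decision maker at 100.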

Small sample sizitis, October 24, 4:42 PM:

One of these days I am going to write a Primer for MLB managers – seriously.

Chapter X: “Mr. LaRussa (as a metaphor for most managers), throw those index cards away!”

2011 WS discussion: Game 5, October 24, 11:16 PM:

Tony is closing the barn door. Where’s the horse? The horse?????

Managing the 2011 World Series: Game 3, October 26, 3:20 AM:

Bill, I appreciate the input. I also happened to have played high stakes poker for many years – quite successfully as you might imagine. So I kind of know what I am talking about when it comes to poker! ;-)

I don’t usually make reference to things I am not grounded in…

Meaning in hot and cold zones?, October 26, 1:51 PM:

I have not RTFA yet, but I love the idea of estimating true talent from observed data for anything, especially pitch type and location data.

I have thought about this for a long time. It is especially important for advanced scouting and I think that many teams ignore this fact. For example, if two batters are observed to have exactly the same hot and cold patterns, and they are quite far from league average, but one has 400 PA and the other has 4,000, you would pitch each one quite differently (the one with 400, you would pitch him much closer to that of the league average player). I’m not sure teams do that. Same for defending a batter based on spray charts. Those spray charts have to be regressed!

Now, I’ll RTFA!
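A rough Python sketch of MGL’s 400 PA versus 4,000 PA point follows. The regression constant `k`, the league value, and the observed zone value are all hypothetical numbers chosen for illustration, not anything a team or MGL has published.

```python
def regress_toward_league(observed, n, league, k=1000):
    """Shrink an observed zone value toward the league value.

    n: plate appearances behind the observation.
    k: PA at which the sample gets half the weight (assumed value).
    """
    weight = n / (n + k)
    return league + weight * (observed - league)


league = 0.300    # assumed league-average value in some zone
observed = 0.400  # both hypothetical batters show the same hot zone

est_400 = regress_toward_league(observed, 400, league)
est_4000 = regress_toward_league(observed, 4000, league)

print(f"400 PA:  pitch him as if his true value is {est_400:.3f}")
print(f"4000 PA: pitch him as if his true value is {est_4000:.3f}")
```

Same observed pattern, very different estimates: the 400 PA batter ends up much closer to league average, which is exactly why the raw chart should not be used as is.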

Batter-pitcher matchups, October 27, 1:28 PM:

I have not read the article yet, but it sounds like truly great work.

So basically what we have already said a million times before: “Use it as a tie-breaker and nothing else.”

Yes, I would like to see how much to regress the batter-pitcher results toward the expected results (for a given number of PA of course) or how much to regress the expected results toward the batter-pitcher sample.

Colin, one thing: I assume (again, I have not RTFA yet) you did not control for GB/FB platoon. If you do that, I suspect that the entire predictive value might disappear. Can you perhaps tell us in the extreme cases on both ends what the average GB/FB ratio is for the pitchers and batters?

World Series 2011 Game Thread: Final Game, October 28, 10:00 PM:

That was one of the worst IBB’s you will ever see. Of course I’ve said that many times.

Do you think that Ron Washington even knows that when you IBB the bases loaded, that if the next guy walks, it’s a run?

I don’t care how good a people person he is, the guy is a stone cold moron!

World Series 2011 Game Thread: Final Game, October 28, 10:01 PM:

I wish that I were part of the mainstream media so that I could tell it like it really is…

That seems like a fitting comment to go out on. I hope you enjoyed this edition of The Best of MGL.

Scott Sabol Gets Laughed off the Internet

Scott Sabol finished off his brief, yet stunningly naive, post at Beyond the Boxscore with:

The bottom line: Pitching overall is getting better. Batters are getting fooled more than they did just 5 years ago.

I’m trying to go easy on a guy who’d say that with such conviction based off arbitrary numbers chosen from the Pitch Type, Batted Ball, and Plate Discipline tabs on FanGraphs’ season leaderboard. But that doesn’t mean anything to Colin Wyers, Official Hater of Subjective Statistics:

There’s someone getting fooled here, but it isn’t batters.

Nine minutes later, he added:

You have to go out to three significant digits to see a change in walk rates at all. You see a .016 rise in strikeout rates.

If your data suggests to you that there’s been a 500% increase in swinging at pitches in the strike zone over this time period, then may I suggest to you that what you’ve found isn’t “interesting,” it’s bad analysis caused by bad data.

Justin Bopp (likely the idiot who chose to run this article) is there to rebuild Sabol’s fragile ego:

Lots of good feedback here, Scott.

Keep at it!

It’s obvious that Sabol is a defenseless swimmer in shark-infested waters when matched against Colin Wyers. Anyone with half a brain could see this article made no sense. So, why did Beyond the Boxscore run it anyway? Allow me to give you some advice. FanGraphs may be kicking your ass in page views and popularity, but that doesn’t mean you have to rip off their content.

Great/Horrible Moments in FanGraphs Trolling

Dave Cameron announced he has acute myeloid leukemia, which it goes without saying, is sad. FanGraphs trolls managed to keep the comment thread respectful for three and a half hours.

Mike:

If you die, can I have your spot on the staff?

DAVE CAMERONS WIFE:

WOOF! WOOF! WOOF! WOOF! WOOF!

Lou Keemia:

A lot changes in a year…Mariners from 6th best organization in baseball to the worst. Dave Cameron from alive and well to dead as a doornail.

Lou Keemia:

OMG JUST KEEP FIGHTING THE GOOD FIGHT, DAVE. YOU MEAN SOOOOOOOOOOOOOOOOO MUCH TO ME EVEN THOUGH I HAVE NEVER MET YOU BEFORE. YOU’RE PROBABLY THE MOST INFLUENTIAL PERSON IN MY LIFE AND I WILL NEVER FORGET YOU. WORDS CANNOT BEGIN TO DESCRIBE MY FEELINGS FOR YOU. AFTER EVERYTHING YOU’VE DONE FOR ME AND MY FAMILY, INTRODUCING US TO WAR AND UZR, I FEEL LIKE I’D BE LOSING MY FATHER OR MY BROTHER. PLEASE DONT DIE ON ME CHIEF, I CAN’T HANDLE IT. I MIGHT HAVE TO OFF MYSELF JUST TO BE WITH YOU.

HOW DO I LIVE WITHOUT YOU
I WANT TO KNOW
HOW DO I BREATHE WITHOUT YOU
IF YOU EVER GO
HOW DO I EVER, EVER SURVIVE
HOW DO I
HOW DO I
OH, HOW DO I LIVE

Garrett:

Hope you get better. Though this article sucks. Considering statistics will have a meaningful impact on whether you die or not, probably you should choose another anecdote to talk about how statistics are useless. (I’m sure since you don’t use statistics you won’t write a will either. Or make any preparations in case of death. The same as if you were a healthy man in your mid-20s.)

Either way. I look forward to mocking your complete lack of knowledge of personal finance for years to come.

Say My Name Bitch:

Dave, I just want you to know that I am in your corner and I am rooting for you….TO DIE!!!!!!!!!!!

I think Telo summed it up nicely:

And I thought I was the douchebag of Fangraphs. What the hell is wrong with you people?

I must say that I love FanGraphs’ comment rating system. Other sites with an up/down voting system use those votes to determine whether to show or hide a comment and sometimes the order of the thread, too. Even Baseball Prospectus, with a website stuck in 2001, does this. But FanGraphs just displays a big red number next to poorly-rated comments and does nothing else with them. Which is great for me. If I’m skimming through a lengthy thread, I make sure to stop and read all the red ones.

One Idiot with a Stats Software Package

if tRA is FIP having a nightmare SIERA is giving two idiots a stats software package and telling them not to ask questions

Just when you thought this blog was dead and buried, Matt Swartz comes riding to the rescue. At FanGraphs, he has a new five-part series on everyone’s favorite stat, SIERA. Because last time around, as Swartz trumpets, he and his partner in stupidity crime, Eric Seidman, “didn’t totally appreciate why it worked.” And the name “skill interactive” was completely misleading, too. It’s not like you two devoted more than 10,000 words and its own five-part introductory series on Baseball Prospectus about it last winter. This time, though, Swartz has totally got this.

He isn’t shying away, though. He answers the questions SIERA-atics (like myself) have often asked, like, “Why aren’t there more terms in this equation?” To which he says, in Part Two, “Excellent question. I’ve added (BB/PA)^2, (SO/PA)*(BB/PA), a run-environment variable, and percentage of innings as a SP! And all only because they improve my RMSE!” Swartz even managed to flip the sign on one of the preexisting terms with no explanation why.

I don’t know anything about FanGraphs’ business, but bringing on Matt Swartz and letting him revamp SIERA has to be a waste of money. A one-percent improvement over xFIP would be valuable to a team, I imagine, but to the average fan, it’s worth zero. Maybe less than zero when it’s impossible to explain in English the rationale for the stat. (Though we’ll have to wait until Part Four to see the comparison between the two, I wouldn’t bet the improvement is close to one percent. And there’s always a good chance that the comparisons aren’t done correctly anyway.) So they’re paying Swartz to blather on about something pointless at best and wasting Dave Appelman’s time in having to add it to their database. The rich grandpa lives on.

Somewhat surprising to me is that the FanGraphs commenters are being uncharacteristically kind to Swartz and his Frankenstein stat. Baseball Prospectus commenters, less so.

The Hardball Times Commenters Exist!

SABR types love arguing and they love the internet. So arguing on the internet comes naturally. You can find stupid arguments all over the comments of The Book, Baseball Prospectus, BBTF, and a bunch of crappy SB Nation blogs. But you couldn’t, until yesterday, on The Hardball Times. Before then, all you’d see was the occasional person calling out Tuck for being an idiot.

It took an astonishingly bad article, but look for yourself–a genuine SABR fight. Here’s the play-by-play.

  • 10:29 a.m. – Bob Lee starts off by noting how stupid Paul Francis Sullivan is (13 words)
  • 14 minutes later – Brad Johnson comments (144 words)
  • 2 minutes later – Mark comments (39 words)
  • 4 minutes later – Mark comments again (83 words)
  • 2 minutes later – ecp comments (13 words)
  • 8 minutes later – Brad Johnson comments (119 words)
  • 3 minutes later – Mark comments (83 words)
  • 5 minutes later – Brad Johnson accuses Mark of being schizophrenic (12 words)
  • 3 minutes later – Mark reveals there is another Mark (31 words)
  • 45 minutes later – the second Mark makes himself known and promotes his theory that there are two Pirates organizations; how ironic (82 words)
  • 34 minutes later – Paul E comments (52 words)
  • 6 minutes later – Brad Johnson comments (112 words)
  • 8 minutes later – Mark comments (142 words)
  • 8 minutes later – Brad Johnson comments (143 words)
  • 16 minutes later – Mark finally responds (170 words)
  • 8 minutes later – Brad Johnson comments (152 words)
  • 7 minutes later – Mark breaks out the caps lock, twice (109 words)
  • 9 minutes later – Brad Johnson is cowed by Mark’s caps lock (88 words)
  • 4 minutes later – Mark gloats (122 words)
  • 9 minutes later – Brad Johnson continues writing for some reason (129 words)
  • 5 minutes later – the second Mark chimes in; he’s very sneaky (149 words)
  • 2 minutes later – original Mark breaks out the caps lock again to seal the victory (140 words)
  • 14 minutes later – confused because Brad Johnson hasn’t commented in over 20 minutes, Mark uses more caps lock (77 words)
  • 19 minutes later – Brad Johnson returns to defend his honor (98 words)
  • 9 minutes later – Mark repeats himself (124 words)
  • 10 minutes later – Brad Johnson repeats himself (126 words)
  • 1 minute later – Mark openly mocks Brad Johnson (15 words)
  • 5 minutes later – Mark comes back for more, including caps lock (90 words)
  • 1 minute later – Paul Francis Sullivan wants something noted for the record, which I suppose would be this blog; so noted (11 words)
  • 3 minutes later – Brad Johnson accuses Mark of being contrary (129 words)
  • 19 minutes later – Mark continues to troll, comparing Pedro Alvarez to his dog (51 words)

Since then, Brad Johnson hasn’t shown his face. He must be ashamed. Mark, for his part, has been proudly trumpeting his victory and taking on any newcomers, to the tune of 9 posts and 1,741 words in the last 24 hours.

MGL Channels Travis Bickle

“Do your research; find the actual memo next time.”

You talking to me? What exactly did I get wrong? Not that it matters in the least. Who the hell are you?

And that’s why he gets to be in the header. MGL hasn’t written a more devastating put-down since this gem (one of the best paragraphs on the internet, in my opinion):

Spike, chill! I don’t get a “pass” because I am MGL. My projections are annually in the same league as the best on the planet. That is why I get a “pass.” And because I am considered one of the pre-eminent sabermetricians in the world. You? I didn’t catch your name?

AndrewN, you’ve been MGL’d.

This Week in SABR War

The Tangettes are revolting. Over on The Inside The Book The Book — Playing the Percentages in Baseball Blog (which reminds me of the Official Stephen A. Smith My Blog), you can witness the uprising in comment form against the God-King Tangotiger.

For those too squeamish for uncensored carnage, Tango said something about Stephen Strasburg and how he (that being Tango) is always right. And then we get to the comments. Here are highlight selections, in chronological order.

Mike Fast:

I don’t know what lesson, if any, I’d take from such a small sample, but it certainly would not be the lesson you [Tangotiger] are proposing.

Ken:

I don’t see how you can beat your chest on this topic, if anything I would expect you [Tangotiger] to post a “my bad”

Mike Fast:

Could your [Tangotiger's] rule of thumb still be right, despite Strasburg’s performance? I suppose it could be. But to try to use his performance as proof that you were right is involving some major arm-twisting and severe avoidance of sound sabermetric principles.

Tangotiger:

This is how it works guys. That’s why the Tom Seaver Rule is needed.

David Gassko (with an instant 2010 SABR Comment of the Year candidate):

Tom,

No, no, no, no, and once more, no. You CANNOT say that Strasburg was lucky, because we are not having this argument ex-post. The question of how Strasburg would do came up before he had ever thrown a major league pitch—therefore, there is NO reason to “correct” bias in his numbers. That would be like regressing to the mean twice. If a pitcher posts a 2.00 ERA in a season, maybe his likeliest true talent projection is 3.00. If a pitcher posts a 3.00 ERA, maybe his likeliest true talent is 3.75. But if a pitcher posts a 2.00 ERA, it does not follow that his likeliest true talent is 3.75. Which is what you are currently trying to argue. Strasburg was a pre-selected subject. Therefore, there is no reason to expect bias in his numbers. What happened happened. Oliver was right. You were wrong. End of story.
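Gassko’s “regressing to the mean twice” complaint can be shown with his own round numbers. The sketch below assumes a simple linear shrink; the 0.75 weight and the 6.00 mean are just the values his “maybe” figures (2.00 to 3.00, 3.00 to 3.75) happen to imply under that assumption, not a realistic league ERA and not anyone’s actual projection method.

```python
def regress_once(observed_era, mean_era=6.00, weight=0.75):
    """Shrink an observed ERA toward mean_era, keeping `weight` of the gap.

    mean_era and weight are back-solved from Gassko's illustrative numbers,
    assuming a linear shrink; they are not real-world values.
    """
    return mean_era + weight * (observed_era - mean_era)


observed = 2.00
once = regress_once(observed)   # the legitimate projection
twice = regress_once(once)      # regressing an already-regressed number

print(f"observed {observed:.2f} -> regressed once {once:.2f} -> regressed twice {twice:.2f}")
```

Regressing once takes the 2.00 ERA to 3.00. Getting to 3.75 requires running the already-regressed 3.00 through the same shrink again, which is the double-counting Gassko is objecting to.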

Tangotiger:

End of story.

You can say all the rest, but don’t say that.

Nick Steiner:

I agree with many of the points you [Tangotiger] make, but this is incredibly disingenuous.

Tangotiger:

And I’m saying that we observed 75% keeps the conversation open. Telling me “end of story” is the same thing as telling me to shut up. I’m talking, and I’ll keep talking, thanks.

Jeremy Greenhouse:

Tango, this doesn’t feel right. I think that you should take a few steps back from this argument and start running some numbers. Brian’s projection of Strasburg is something he should take pride in, and it seems like you’re summarily dismissing his work without evidence of your own.

And Tangotiger gets the last word (for now):

I was fair then in that thread, and I was fair in this thread.

Lest We Forget

James Click, February 2, 2006:

Lest we forget, this is only PECOTA’s third season; wait until you see where BP is three years from now.

evo34, February 27, 2010:

So let’s take a look at how PECOTA projects the top five hitting prospects in baseball to “grow” over the next five years [TAv taken from 10-year forecast]:

Jason Heyward (age 20):
2010: .282
2011: .276
2012: .277
2013: .276
2014: .271

Mike Stanton (age 20):
2010: .265
2011: .264
2012: .265
2013: .259
2014: .257

Desmond Jennings (23 years old):
2010: .269
2011: .269
2012: .278
2013: .273
2014: .270

Buster Posey (22 years old):
2010: .266
2011: .269
2012: .270
2013: .269
2014: .268

Pedro Alvarez (23 years old):
2010: .266
2011: .260
2012: .269
2013: .258
2014: .259

So, basically none of the top five prospects in baseball are projected to improve over the next five years. Apparently, each has already peaked as a mediocre MLB regular. Anyone who has used PECOTA projections over the years will understand how massively different these projections look than those of years past. They (Pease et al.) have essentially diluted the informational content out of prospect projections to the point where all major prospects are projected to follow an eerily similar career path.

In short, this is worse than New Coke. Someone has significantly changed the algorithm (intentional or not), and there is no documentation of what has changed or why. There is simply no way to trust any of the PECOTA projections for this season — esp. those of prospects. This is extremely unfortunate as long-term projections were the last remaining competitive advantage BP had over competing forecast services (for data forecasts, not editorial content). A full article on this debacle (not another “Unfiltered” side-note) is warranted.

I am not trying to bash BP as much as I am expressing my personal disappointment at not having a source for accurate long-term prospect projections for the first season in a very long time. I honestly don’t know of anyone else who takes a numerical approach to evaluating minor leaguers. If anyone does, please post.
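For anyone who wants to check evo34’s arithmetic, here is a small Python sketch that tabulates the projected TAv figures exactly as listed in the comment above. The only things added are the variable names and the choice to compare 2010 against both the five-year peak and 2014.

```python
# Projected TAv for 2010-2014, copied from the figures quoted above.
tav = {
    "Jason Heyward":    [0.282, 0.276, 0.277, 0.276, 0.271],
    "Mike Stanton":     [0.265, 0.264, 0.265, 0.259, 0.257],
    "Desmond Jennings": [0.269, 0.269, 0.278, 0.273, 0.270],
    "Buster Posey":     [0.266, 0.269, 0.270, 0.269, 0.268],
    "Pedro Alvarez":    [0.266, 0.260, 0.269, 0.258, 0.259],
}

# Change from the 2010 projection to the 2014 projection, per player.
change_by_2014 = {name: proj[-1] - proj[0] for name, proj in tav.items()}
# Best projected gain over 2010 at any point in the five years.
best_gain = {name: max(proj) - proj[0] for name, proj in tav.items()}

for name in tav:
    print(f"{name:<17} 2010-2014 change {change_by_2014[name]:+.3f}, "
          f"best gain over 2010 {best_gain[name]:+.3f}")
```

On these numbers, no player is projected to be more than two points of TAv better in 2014 than in 2010, and three of the five are projected to be worse.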

So PECOTA makes terrible projections for the top position player prospects and thus PECOTA’s own value falls in a very Heyward’s-career-according-to-PECOTA way. What irony.