An aspect of baseball that has fascinated me for years is the mental chess match that goes on between the batter and the pitcher during an at-bat. Each player is constantly trying to get into his opponent's head and guess what he might do next. The batter might use his knowledge of the pitcher to predict whether the pitcher will challenge him with a fastball or entice him to chase a breaking ball out of the zone. Meanwhile, the pitcher uses information about the batter to formulate a sequence of pitches that should send him back to the bench with a strikeout.

It is in these mind games that I find statistics and sabermetrics can be applied most effectively. The more relevant data a pitcher or batter has, the larger his advantage. In one of my previous articles, I quantified how well pitchers hide their pitches and discussed how batters could use this information to identify what a pitcher might be throwing (no trash cans necessary, Astros). In this article, however, I want to investigate how pitchers can use information about the hitter to give themselves an edge. In particular, I will try to estimate the probability that a given batter will swing at a given pitch.


Modeling Swing Probability