2022 Blogmas Day 4 – Weighted On Base Average

You may have recognized that OBP and SLG each measure slightly different things. OBP only measures whether you got on base, while SLG makes a big fuss over your hits in particular, and how “big” they were. OPS (and even better, OPS+) is a quick summary of the two combined, which is cool. However, OPS treats OBP and SLG as essentially equal in importance, while getting on base is demonstrably of higher value. If you want a more accurate measure of offensive production that is (relatively) simple to calculate, look no further than weighted on-base average, or wOBA.

A hit has a different effect on the game than a walk, and different kinds of hits also affect the game differently. That’s the basic tenet encouraging us to use a weighted average that accounts for the changing offensive landscapes across baseball. OBP is better than BA because it includes walks, an important offensive contribution; SLG is an alternative measure that roughly considers the value of hits, but only in a naive way. We’re left uncertain how comparable these are, and whether they truly measure value. What about comparing players from different seasons, when the run-scoring environment was different?

The ingenuity of wOBA is to consider the true value added each time a batter makes it on base, using the entirety of the season to calculate said value. Then, we weight each element of OBP by these calculated factors. While SLG assumes that a double is twice as valuable as a single, it turns out that a double is typically only about 1.4 times as valuable as a single. And, intuitively, we also know hitting a single is overall better than just a walk, because runners can advance more than 1 base on a single.

Let’s jump straight to the equation using 2022 statistics. It’s more or less what you’d expect: a weighted sum of every way you can get on base (we ignore intentional walks, so uBB means “unintentional walks”), divided by the number of opportunities that would allow a batter to get on base.[1]

\begin{aligned}
\text{wOBA} = &(0.689\times \text{uBB} + 0.720 \times \text{HBP} + 0.884 \times \text{1B} \\
&+ 1.261\times\text{2B} + 1.601\times\text{3B} + 2.072\times\text{HR})\\
&\div(\text{AB}+\text{BB}-\text{IBB}+\text{SF}+\text{HBP})
\end{aligned}
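
As a quick sanity check, the formula can be sketched in Python. This is a minimal sketch, not an official implementation: the function name `woba` and the stat line in the example are made up for illustration, the weights are the 2022 values from the equation, and the denominator is taken to be at-bats plus unintentional walks, sacrifice flies, and hit-by-pitches.

```python
# 2022 weights, copied from the equation above.
WEIGHTS_2022 = {
    "uBB": 0.689, "HBP": 0.720, "1B": 0.884,
    "2B": 1.261, "3B": 1.601, "HR": 2.072,
}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr,
         weights=WEIGHTS_2022):
    """Weighted on-base average from basic counting stats."""
    ubb = bb - ibb  # unintentional walks
    numerator = (
        weights["uBB"] * ubb
        + weights["HBP"] * hbp
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * hr
    )
    opportunities = ab + ubb + sf + hbp
    if opportunities == 0:
        raise ValueError("no opportunities, wOBA is undefined")
    return numerator / opportunities


# Hypothetical stat line (not a real player).
example = woba(ab=500, bb=60, ibb=5, hbp=5, sf=5,
               singles=90, doubles=30, triples=3, hr=25)
print(round(example, 3))  # 0.381
```

Passing the weights as a parameter means the same function works for any season, as long as you supply that season's factors.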

A natural question is: where do these weights come from? Great question! I’m not going to reinvent the wheel, so if you want all the gory detail you should check out this FanGraphs page. But I will give a quick summary.

Linear Weights, In Brief

This all works because so many plays occur in an MLB season that the league is effectively a random number generator. The statistics are meaningful because the sample size is so large, and each individual player’s contribution to the league-wide totals is negligible.

First, we create a matrix of base-running situations. We can have 0, 1, or 2 outs in an inning, and there are 8 possible combinations of runners on base.[2] This gives us 24 possible situations. The entries in the matrix are the run expectancies computed from the season’s situational data: the runs scored in the rest of the inning from the point where that situation occurred, divided by the number of times the situation occurred.
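
To make the bookkeeping concrete, here is a rough Python sketch of building that matrix. The input format is an assumption for illustration: each observation is a hypothetical `(outs, bases, runs_rest_of_inning)` tuple that you would have to extract from play-by-play data yourself, and the three sample rows are invented.

```python
from collections import defaultdict

# The 8 base states: "-" is an empty base, a digit is an occupied one.
BASES = ["---", "1--", "-2-", "--3", "12-", "1-3", "-23", "123"]
STATES = [(outs, bases) for outs in range(3) for bases in BASES]  # 24 total


def run_expectancy(observations):
    """Average runs scored in the rest of the inning, per situation."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for outs, bases, runs_rest_of_inning in observations:
        totals[(outs, bases)] += runs_rest_of_inning
        counts[(outs, bases)] += 1
    # Situations that never occurred are simply left out of the result.
    return {state: totals[state] / counts[state] for state in counts}


# Invented observations, just to show the shape of the data.
sample = [(0, "---", 0), (0, "---", 1), (1, "1--", 2)]
re_matrix = run_expectancy(sample)
```

With a full season of data, every one of the 24 situations shows up thousands of times, which is what makes the averages trustworthy.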

Each time a batter gets on base, they move from one entry in the matrix to another; that changes the run-expectancy. For each type of event, we sum these changes across all events of that type, and divide by the number of those events.
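
In code, that averaging step might look like the following sketch. Everything here is illustrative: the event tuples, the tiny run-expectancy table, and its values are invented. I'm also assuming the usual convention that runs scoring on the play itself are added to the change in run expectancy, and that an inning-ending play leaves a run expectancy of zero.

```python
from collections import defaultdict


def average_run_values(events, re_matrix):
    """Mean change in run expectancy for each event type."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for kind, before, after, runs_on_play in events:
        # Assumption: `after` is None when the play ends the inning.
        re_after = 0.0 if after is None else re_matrix[after]
        totals[kind] += re_after - re_matrix[before] + runs_on_play
        counts[kind] += 1
    return {kind: totals[kind] / counts[kind] for kind in counts}


# Invented run expectancies for a handful of (outs, bases) situations.
toy_re = {(0, "---"): 0.5, (0, "1--"): 0.9, (0, "-2-"): 1.1, (1, "---"): 0.25}

# Invented events: (type, situation before, situation after, runs on play).
toy_events = [
    ("1B", (0, "---"), (0, "1--"), 0),   # leadoff single
    ("2B", (0, "1--"), (0, "-2-"), 1),   # double that scores the runner
    ("OUT", (0, "---"), (1, "---"), 0),  # first out, bases empty
]
values = average_run_values(toy_events, toy_re)
```

Note that the out comes back with a negative value, since it lowers the number of runs we expect from the rest of the inning.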

Now we have a run-expectancy value for every event in the numerator of our equation. But this is not our final set of factors. An out also changes our run-expectancy, but to get wOBA to “act like” OBP, we shift our factors so that an out has no effect.[3]

Finally, we calculate a tentative league-wide wOBA using these new weights. If it’s different from the league OBP (it will be), we multiply all the factors by a constant so that the league’s wOBA exactly matches the league OBP. This means we have a comparative scale with which to understand wOBA: it’s the same scale we use to determine if someone’s OBP is good.
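
The last two steps (zeroing out the out, then matching league OBP) can be sketched like this. All of the numbers are placeholders I made up to exercise the code, not real league data, and `scale_weights` is a hypothetical helper name. The sketch assumes the final adjustment is a single multiplicative constant applied to every factor.

```python
def scale_weights(run_values, league_counts, league_opportunities, league_obp):
    """Shift so an out is worth 0, then rescale to the league OBP."""
    out_value = run_values["OUT"]
    shifted = {k: v - out_value for k, v in run_values.items() if k != "OUT"}

    # Tentative league wOBA with the shifted (unscaled) weights.
    tentative = sum(shifted[k] * league_counts[k] for k in shifted)
    tentative /= league_opportunities

    scale = league_obp / tentative
    return {k: v * scale for k, v in shifted.items()}, scale


# Placeholder inputs, invented for illustration only.
run_values = {"uBB": 0.30, "HBP": 0.33, "1B": 0.45, "2B": 0.76,
              "3B": 1.04, "HR": 1.40, "OUT": -0.26}
league_counts = {"uBB": 14000, "HBP": 2000, "1B": 26000,
                 "2B": 8000, "3B": 700, "HR": 5200}
weights, scale = scale_weights(run_values, league_counts,
                               league_opportunities=180000, league_obp=0.312)

# By construction, the league wOBA now equals the league OBP.
league_woba = sum(weights[k] * league_counts[k] for k in weights) / 180000
```

Because every factor is multiplied by the same constant, the ratios between events (say, a double versus a single) are unchanged by this final step.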

And We’re Back

wOBA is an excellent statistic. Because it’s directly calculated by considering the change in run-scoring potential during each trip to the plate, it can be interpreted as the number of runs a player is worth per plate appearance.[4] As a rate statistic, it is perhaps the best measure of “how does a player contribute offensively when they play?” while still using freely available data.[5]

It’s a natural extension of some of the naive statistics that don’t account for how different plays can drastically affect the game, yet it is scaled in a way that remains understandable for fans just getting into advanced metrics.

As always, here are some fun facts to finish your day with.

  • The player with the highest wOBA in 1996, my birth year, was Chuck Knoblauch of the Minnesota Twins. His wOBA was .422.
  • The player between the ages 30 and 31 with the highest single-season wOBA was Jimmie Foxx in 1938, with a wOBA of .508.

Continue to Day 5 – Three True Outcomes

  • 1
    Note that this is not plate appearances, and not just at-bats, like BA. It’s the same mix OBP uses (at-bats, plus times where you get walked or hit by a pitch, and sacrifice flies), except that intentional walks are left out. Note that we only include sacrifice flies, not bunts. A sacrifice fly is almost always a bit lucky; you hit a big fly ball, and because a runner happened to advance, we count it as a sacrifice. That same fly ball could have landed, or could have happened with nobody on base, in which case it would just be an at-bat. If you bunt, though, it’s very intentional, so we don’t count it against the batter.
  • 2
    Nobody on; runner on 1st, 2nd, or 3rd; runners on 1st and 2nd, 1st and 3rd, or 2nd and third; or bases loaded.
  • 3
    We find the factor for all outs across the league (say it’s -0.26), and subtract it from every factor. Since the value of an out is negative, this will increase all of our other factors.
  • 4
    We’ll use it later in this series to derive a counting statistic that measures the number of runs a player contributes to their team.
  • 5
    Of course, it’s a bit tedious to do the calculations to find the weights. Yet, it’s eminently doable.
