What a great video. Destin is always captivating, and I enjoyed the crossover into one of my favorite sports.
My only comment is near the end, where he takes issue with the terms “overstable” and “understable” to describe different disc flights. While the usage is the opposite of what one might consider stability in aerodynamics, it makes perfect sense in the context of the sport: an “overstable” disc is extremely stable in different wind conditions and forgiving of angles.
With the World Baseball Classic coming up, I pulled every baseball player in the Lahman Database whose name perfectly matches a country. Here are some simple statistics among the country representatives with a batting record.
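| Country Name | Total Hits | Total Homers | Total AB | Batting Average | AB/HR |
|---|---|---|---|---|---|
| Jordan | 5669 | 450 | 21717 | .261 | 48.26 |
| Chad | 3854 | 384 | 15903 | .242 | 41.41 |
| Germany | 2569 | 56 | 10346 | .248 | 184.75 |
| France | 662 | 74 | 2514 | .263 | 33.97 |
| India | 482 | 63 | 1905 | .253 | 30.24 |
| Chile | 142 | 0 | 627 | .226 | NA |
| Holland | 127 | 3 | 618 | .206 | 206.00 |
| Jersey | 111 | 2 | 727 | .153 | 363.50 |
| Portugal | 89 | 2 | 450 | .198 | 225.00 |
| Poland | 39 | 0 | 211 | .185 | NA |
| Israel | 33 | 6 | 132 | .250 | 22.00 |
| Monaco | 2 | 0 | 13 | .154 | NA |
| Ireland | 1 | 0 | 7 | .143 | NA |
| Ceylon | 0 | 0 | 18 | .000 | NA |
| Total | 13780 | 1040 | 55188 | .250 | 53.07 |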
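For anyone who wants to check the two rate columns, here is a minimal sketch of how they follow from the counting stats. The function names are my own for illustration, and the three rows are copied from the table; the NA entries are assumed to mean "no home runs, so the ratio is undefined."

```python
from typing import Optional


def batting_average(hits: int, at_bats: int) -> Optional[float]:
    """Hits divided by at-bats; None when there are no at-bats."""
    return hits / at_bats if at_bats else None


def at_bats_per_home_run(at_bats: int, home_runs: int) -> Optional[float]:
    """At-bats per home run; None (the table's NA) when no home runs were hit."""
    return at_bats / home_runs if home_runs else None


def format_average(avg: Optional[float]) -> str:
    """Render an average baseball-style, without the leading zero (e.g. .261)."""
    return "NA" if avg is None else f"{avg:.3f}".lstrip("0")


# (name, hits, home runs, at-bats), copied from the table
rows = [
    ("Jordan", 5669, 450, 21717),
    ("Chile", 142, 0, 627),
    ("Israel", 33, 6, 132),
]

for name, hits, home_runs, at_bats in rows:
    ab_hr = at_bats_per_home_run(at_bats, home_runs)
    ab_hr_text = "NA" if ab_hr is None else f"{ab_hr:.2f}"
    print(name, format_average(batting_average(hits, at_bats)), ab_hr_text)
```

Returning None instead of dividing by zero keeps homerless countries like Chile in the table rather than crashing on them.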
Here’s a fun bit of trivia about these Jordan folks: Until 1999, every player matching Jordan had it as their last name. Since 1999, all but two have the first name Jordan.
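The name matching behind that table could be sketched roughly like this. It assumes a "perfect match" means a player's first or last name equals a country name exactly, which is my reading of the Jordan trivia rather than a documented rule. The player rows are made-up stand-ins for the Lahman People table, and the country set is abbreviated.

```python
# Made-up stand-ins for rows of the Lahman People table (not real players)
people = [
    {"playerID": "examp01", "nameFirst": "Jordan", "nameLast": "Example"},
    {"playerID": "examp02", "nameFirst": "Sample", "nameLast": "Jordan"},
    {"playerID": "examp03", "nameFirst": "Sample", "nameLast": "Jordanson"},
]

# Abbreviated, illustrative list of country names
COUNTRIES = {"Jordan", "Chad", "Germany"}


def matching_country(person, countries=COUNTRIES):
    """Return the country that a first or last name equals exactly, else None."""
    for field in ("nameFirst", "nameLast"):
        name = person.get(field)
        if name in countries:
            return name
    return None


# Keep only players with an exact match; "Jordanson" is a partial match and is dropped
matches = {}
for person in people:
    country = matching_country(person)
    if country is not None:
        matches[person["playerID"]] = country

print(matches)
```

Exact equality is what keeps near misses out of the list, and checking both name fields is what lets first-name and last-name Jordans land in the same row.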
I first joined the Society for American Baseball Research (SABR) as a student member around 2016. Back then, I only knew about SABR through the semi-eponymous term “Sabermetrics,” which roughly refers to the growing list of statistics and measures used to evaluate baseball. I was studying math and dabbling in computer science with vague hopes and dreams that I could bring those interests to bear on the game I loved. I knew little else about the organization, so when I met some excellent members of the Halsey Hall chapter in Minnesota at TwinsFest that winter, I was a bit shocked to find them all rather old. Not that they couldn’t have been career statisticians or had other skills relevant to analyzing baseball, but I was an introverted college student and didn’t appreciate the interaction.
Not until rejoining SABR four years ago did I realize the scope of their work and how many options I had for volunteering my time and talents to advance their mission. There are a few dozen research committees with varying scopes and interests. The two I’ve been most involved with are wide-ranging and ambitious: the Games Project provides accounts of significant games and their historical context,[1] and the Biography Project writes comprehensive biographical articles about people in and around baseball. These volunteer-led committees are amazing. Some of their processes are charmingly stuck in the past, like their heavy reliance on email server lists. Still, they don’t let anything get in the way of producing well-written and thoroughly researched materials.
[1] This is the only project for which I’ve written an article.
SABR is a wonderful organization. It speaks to how well baseball captures the imagination of fans (and nerds) around the world, its historical importance, and the unique aspects of its design as a game that lend it to statistical revelations before any advanced camera-tracking technology was available. While I’m sure my involvement will fluctuate, I’m comfortable saying that my relatively inexpensive membership will be renewed for years to come.
It comes down to extra innings and squandered opportunities, a year of firsts and not quite enough. Series were earned and given away, and the emotional pendulums of games were like rocket-propelled swing sets.
While the modern iteration of Amherst College’s baseball team is approaching three decades of minimal success in NCAA Division III, its origins date back over 165 years. That’s before John Smoltz was regularly announcing how much he hates baseball on national baseball broadcasts, before Nolan Ryan demonstrated the thrilling force of old man strength, before the Shot Heard Round the World, before the Iron Horse, before the Red Sox were cursed or Mordecai Brown lost the end of his index finger.
The team began before rules were consistent.[1] Starting at 11 in the morning on the “cool, clear, and bracing”[2] day of July 1, 1859, Amherst faced Williams in the first recorded “Base Ball” game between two colleges.
[1] A note on the genesis of this post: On a train ride back from the Newark airport I searched for “Baseball” in the Hartford Courant archives on Newspapers.com, set my sights anywhere starting in 1839, and sorted by oldest available reference. Since 1839 is apocryphally considered the year when Abner Doubleday established the modern game of baseball, it’s a useful starting point for the search. But it took several more decades before there was standardization.
[2] Springfield Daily Republican, July 2, 1859.
Mark Armour has worked on his Satchel Paige Project for a few years. It’s an amazing feat of historical research about one of the most enigmatic characters and players in baseball history. It’s worth looking through regardless of your overall interest in baseball.
If you’d like to hear a good conversation about the project, I suggest listening to episode 2352 of Effectively Wild, which is how I first learned about this work.
Throughout college, I ran two Turkey Trots in St. Paul and two Goldy’s Runs at the UMN Twin Cities campus. I didn’t complete any of those 5K races without pausing to walk, and I don’t believe I finished any of them in under 35 minutes. I played baseball, which famously doesn’t involve much running beyond sprinting. I never thought I’d catch this particular fitness bug.
A couple of weeks ago, I ran a sub-30-minute 5K in my neighborhood. Last Friday, I ran 5 miles in 0.75-mile intervals, with some walking between them. Running has materialized as the next addition to my Year of Fitness, stacking onto my workouts in Fitbod and coinciding with a diet shift that’s proven successful.
I’ve toyed with running on and off over the years. I spent several months in San Francisco grinding out 20 or 25 minutes on a treadmill at the university gym. It was fine. Uninspired. Boring. It never stuck.
Since committing to running outside and using interval training on my Garmin watch two months ago, I’ve made incredible strides[1] in my running ability. The ensuing success required two mental adjustments:
I can focus solely on my running and not compare it to others.
I already walk thousands of steps each day. Why not earn some of them while running?
[1] Hah, puns.
I no longer conceive of one mile as the longest feasible distance to run without stopping, and I look forward to my lunchtime jogs as much as I ever desired my lunchtime walks over the years. Running is no longer a terrible burden. Instead, it’s a great solo activity that helps my day feel complete.
I have no ambitions for running. I’m not attempting to train for a race of any particular length, besides an informal one-mile trial against my sisters that we’ve often discussed. Three or four times each week, I check how I’m feeling and decide what kind of running workout to attack: long intervals at a slower pace, or short intervals at a fast pace. I’m confident I can do either combination for at least 3 miles.
Running has, most importantly, expanded my sense of what’s possible with dogged determination. By competing against myself and seeing tangible improvement, I’m encouraged to keep going. I wouldn’t consider myself a runner quite yet (we’ll see how I handle winter), but it’s become a focal point of my 2025 theme, and that will keep it around for a while.
I’m fairly confident all Major League Baseball players have gotten bigger over time, but I specifically decided to use the newest version of the Lahman Baseball Database to look at the average weight of catchers by the decade in which they debuted. Their listed weights are static so we can’t be certain what their debut weights were, but we’re looking at large trends. I also required any catcher in the list to have caught at least 200 career games.
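As a rough illustration of that bucketing, here is a minimal sketch in plain Python. It assumes debut dates are strings that start with a four-digit year (for example YYYY-MM-DD). The sample rows are invented purely to exercise the code, and the helper names are mine.

```python
from collections import defaultdict
from statistics import mean


def debut_decade(debut: str) -> int:
    """Map a debut date string to its decade, e.g. '1994-04-05' -> 1990."""
    return int(debut[:4]) // 10 * 10


def average_weight_by_decade(catchers, min_games=200):
    """Average listed weight per debut decade for catchers meeting the games threshold."""
    weights = defaultdict(list)
    for catcher in catchers:
        if catcher["gamesCaught"] >= min_games:
            weights[debut_decade(catcher["debut"])].append(catcher["weight"])
    return {decade: round(mean(ws), 1) for decade, ws in sorted(weights.items())}


# Invented rows, not real players
sample = [
    {"debut": "1991-04-09", "weight": 200, "gamesCaught": 850},
    {"debut": "1998-09-01", "weight": 210, "gamesCaught": 400},
    {"debut": "2004-06-15", "weight": 225, "gamesCaught": 1200},
    {"debut": "2007-05-02", "weight": 190, "gamesCaught": 150},  # under 200 games, excluded
]

print(average_weight_by_decade(sample))
```

Integer division by ten and multiplying back is the whole trick: every debut year from 1990 through 1999 collapses to 1990.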
Distributions of MLB catcher weights by debut decade, with averages and number of players inset.
We can also look only at the average weights per decade to get a clearer sense of the overall trend.
Average weights of MLB catchers by debut decade.
There’s a pronounced increase in the 1940s and again in the 1990s through 2000s, the latter being when players started eating balanced breakfasts.
Technical Details
I first ran this query in the Lahman Database loaded on my computer.
WITH
  "catchers" AS (
    SELECT
      People.playerID,
      People.nameFirst,
      People.nameLast,
      -- Listed weight is a single static value per player
      MAX(People.weight) AS "weight",
      -- Sums only the seasons that pass the G_c filter below
      SUM(Appearances.G_c) AS "gamesCaught",
      SUBSTRING(People.debut, 1, 4) AS "debutYear"
    FROM
      People
      LEFT JOIN Appearances ON Appearances.playerID = People.playerID
    WHERE
      Appearances.G_c >= 10
      AND People.weight > 0
    GROUP BY
      People.playerID,
      People.nameFirst,
      People.nameLast,
      People.debut
    ORDER BY
      weight
  )
SELECT
  "playerID",
  "nameFirst",
  "nameLast",
  "weight",
  "gamesCaught",
  "debutYear"
FROM
  "catchers"
WHERE
  "gamesCaught" >= 200
ORDER BY
  "debutYear"
I exported the resulting data as catchers.csv and used a Jupyter Notebook for the rest.
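import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from os import path

DATA_DIR = '/Users/markrichard/Downloads'
df = pd.read_csv(path.join(DATA_DIR, 'catchers.csv'))

# Bucket each debut year into its decade (e.g. 1994 -> 1990);
# the query above exports debutYear, not a Decade column
df['Decade'] = (df['debutYear'] // 10) * 10

# Remove 2020 partial decade
df = df[df['Decade'] != 2020]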
### MAKE DISTRIBUTIONS ###
# Create base plot
g = sns.displot(df,
                x='weight',
                kind='kde',
                col='Decade',
                col_wrap=3,
                fill=True,
                common_norm=False,
                aspect=1.75)

# Calculate and add average lines
decade_stats = df.groupby('Decade')['weight'].agg(['mean', 'count']).round(1)
for decade, ax in zip(g.col_names, g.axes.flat):
    if decade in decade_stats.index:
        mean_weight = decade_stats.loc[decade, 'mean']
        player_count = decade_stats.loc[decade, 'count']
        # Add vertical line at mean
        ax.axvline(mean_weight, color='red', linestyle='--', linewidth=2, alpha=0.8)
        # Add text annotation
        ax.text(0.02, 0.98, f'Avg: {mean_weight} lbs\nN: {player_count}',
                transform=ax.transAxes,
                verticalalignment='top',
                bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.8))

plt.tight_layout()
plt.show()
### MAKE AVERAGES SCATTER PLOT ###
plt.figure(figsize=(10, 6))
# Simple scatter with consistent sizing
plt.scatter(decade_stats.index, decade_stats['mean'],
            s=100,
            alpha=0.8,
            color='steelblue',
            edgecolors='white',
            linewidth=2)
# Connect points with a line
plt.plot(decade_stats.index, decade_stats['mean'],
         color='steelblue',
         alpha=0.6,
         linewidth=2)
# Clean styling
plt.xlabel('Decade')
plt.ylabel('Average Weight (lbs)')
plt.title('Average MLB Catcher Weight by Debut Decade')
plt.grid(True, alpha=0.2)
plt.tight_layout()
plt.show()