
Median vs. Average Runs Per Game: Thoughts on Pythag and Situational Hitting



Some runs are more important than others. In 2017, the 2nd run scored raised win likelihood the most (by 16%). Beyond the 7th run, each additional run added 4% or less to win likelihood.

2017 W% by RS

Scoring runs 2 through 7 have the greatest effect on winning

Reliably scoring 4 or 5 runs every game should lead to better records than when runs are scored feast-or-famine.

Compare these two make-believe teams (for an explanation of my decimal-value medians, see ‘Methodology’ at the end of this fanpost):

fake_royaks_10_games.0.jpg
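
To make the feast-or-famine contrast concrete, here is a minimal Python sketch. The two 10-game run lines are invented for illustration and are not the ones in the image above. Both teams average 4.5 runs per game, but their medians differ sharply.

```python
from statistics import mean, median

# Hypothetical 10-game run lines, invented for illustration only.
steady_team = [4, 5, 4, 5, 4, 5, 4, 5, 4, 5]
feast_or_famine_team = [0, 1, 1, 2, 2, 3, 3, 10, 11, 12]

for name, runs in [("steady", steady_team),
                   ("feast-or-famine", feast_or_famine_team)]:
    # statistics.median is the plain median, not the fractional
    # median described in Methodology.
    print(f"{name}: average {mean(runs):.2f}, median {median(runs):.2f}")
```

The steady team's median matches its average, while the feast-or-famine team's median sits two full runs below it.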

What about the real world?


Experiment and Results - Offense

During the three post-COVID seasons, 2021-2023, the more a team’s median RS exceeded its average RS, the likelier its actual record outperformed its pythag record. Teams whose median RS fell below their average RS tended to underperform pythag.


3-year_o-u_revised_chart.0.jpg
3-year_o-u_median_pythag_summary.0.jpg

The poster child of this median/average effect is the 2021 Seattle club. They were outscored by 51 runs on the season, but they finished 18 games over .500. Their median exceeded their average RS by 0.21/game.


However, for the three-year sample, statistical noise (±8 W’s per team/season) was louder than the trendline (±2.5 W’s per team/season). We should compile a few more seasons of data before proclaiming this median/pythag correlation as fact.


Experiment and Results - Defense

No relationship was found involving median/average defensive runs allowed (RA), wins, and pythagorean wins.


Why?

Please jump in the comments with your own ideas. Mine would be situational hitting.


If, possibly, it were occasionally in a team’s best interests to give up an out (or otherwise reduce offensive productivity in an at-bat, such as shortening a swing or aiming for the right side of the infield) to increase their odds of scoring just one run; then


It follows that teams who effectively practice ‘situational hitting’ would have higher bottoms (fewer 0- and 1-run games, more 4-run games) but lower tops (fewer 10-run games); then


It follows that opposing defenses, recognizing the efficacy of situational hitting, would take extra risks to mitigate its effects, such as drawing in all defenders (raising likelihood of outs but also of extra-base hits and hence occasional large RA outliers); then


It follows that effective situational hitting could be ‘proved’, over large sample sizes, by median>average RS positively impacting W’s AND by median>average RA having no discernible effect on W’s.


Another reason pitching might fail to show a correlation is managerial choice in low-leverage situations. A game in which a team can afford to give up 5 more runs is a perfect time to cycle through suspect bullpen inventory and allow the opposition a crooked number.


Median Pythagorean Standings by Season, 2021-2023

Thane-W1: 1st-order wins computed from median runs scored. See Methodology at the end of this fanpost for more details.

Clay-W1: Clay Davenport's pythagenport 1st-order wins computed from total runs scored.

Clay-W2: Clay Davenport's 2nd-order wins, adjusted with advanced run-producing metrics.

Clay-W3: Clay Davenport's 3rd-order wins, additionally adjusted by strength-of-schedule.

Total Deviations: The sum of the differences between all 30 teams' adjusted and actual wins. Total Deviation for W would be 0.
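
As a rough sketch of how a Total Deviations figure could be tallied: the definition above does not say whether the differences are signed or absolute, so this example assumes absolute differences. The function name, team labels and win totals are hypothetical.

```python
def total_deviation(actual_wins, projected_wins):
    """Sum of |projected - actual| wins over all teams.

    Assumption: absolute differences; signed differences would
    largely cancel out across a league.
    """
    return sum(abs(projected_wins[team] - actual_wins[team])
               for team in actual_wins)


# Hypothetical two-team example, not real standings.
actual = {"Team A": 90, "Team B": 72}
projected = {"Team A": 84.5, "Team B": 76.0}

print(total_deviation(actual, projected))
print(total_deviation(actual, actual))  # actual wins never deviate from themselves
```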

2021_pythag.0.jpg 2022_pythag.0.jpg 2023_pythag.0.jpg

Current 2024 Median Pythagorean Standings

Through Sunday, May 12:

2024_pythag.0.jpg

Last Thoughts

Homers and strikeouts remain the most basic ways to win and lose ballgames. But for teams that coach situational hitting without harming basic baseball skills and instincts, selectively applying minimax game theory may buy up to a handful of extra wins over a season.


Further research is needed to determine whether a median RS above average RS correlates with outperforming pythag equally well for high- and low-scoring teams, and for high- and low-scoring eras.

Methodology

Taking all scoring data from 2021-2024, I computed the average and the median of runs scored and runs allowed for every team (see note 1). These median RS values, along with actual RA values, were then plugged into a modified pythagorean win-percentage model (see note 2).

Notice how smaller sample sizes not only have greater risk of outlier data, but they also make the median data less precise. (Medians are fractionally computed by 4ths or 5ths instead of by 30ths.) Extra innings also affect data asymmetrically, especially in smaller samples.

1 The midpoint of a numerical data set; but where the midpoint falls among duplicate values, I fractionally metered out the distance between that value and the next higher integer. ONLY KEEP READING THIS NOTE if you want to challenge your math brain and/or question my methodology … For example, if the median falls between the 13th and 14th of 31 4’s [68 games scoring 3 or fewer runs; 31 games scoring 4; 63 games scoring 5 or more], the value I entered was 4 and 25/62, or 4.40. (25/62 is 12.5/31. I score the 1st of the 4’s as 0/31, or 4.00; this keeps the 31st 4 from scoring as 5.00.) But wait … there’s one more step … using actual and ‘hypothetical median’ runs scored across the entire league, each season gets a coefficient (league-wide average RS divided by league-wide median RS) so that league-wide RS = RA is guaranteed. This coefficient is fairly small, typically moving the median RS by about 0.02.
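
The fractional median in note 1 can be sketched in a few lines of Python. This is one reading of the note, not a definitive implementation: the function name is hypothetical, and the case where the two middle games of an even-length season land on different run totals is not covered by the note, so averaging the two metered values there is an assumption. The league-wide coefficient step is left out.

```python
from bisect import bisect_left
from collections import Counter


def fractional_median(runs):
    """Median of per-game run totals, with tied games spread over [r, r+1)."""
    if not runs:
        raise ValueError("need at least one game")
    ordered = sorted(runs)
    counts = Counter(ordered)

    def metered(index):
        # The k-th game (k = 0 for the first) among the games with r runs
        # is scored as r + k / (number of games with r runs).
        r = ordered[index]
        first = bisect_left(ordered, r)  # index of the first game with r runs
        return r + (index - first) / counts[r]

    n = len(ordered)
    if n % 2:
        return metered(n // 2)
    # Even number of games: average the two middle metered values.
    return (metered(n // 2 - 1) + metered(n // 2)) / 2


# The example from note 1: 68 games of 3 or fewer runs (entered as 3 here),
# 31 games of exactly 4, and 63 games of 5 or more (entered as 5 here).
season = [3] * 68 + [4] * 31 + [5] * 63
print(round(fractional_median(season), 2))  # 4 and 25/62, which rounds to 4.4
```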


2 Instead of Bill James’ initial RS^2/(RS^2 + RA^2), or pythagenpat (which uses the exponent ((RS+RA)/G)^0.287), or pythagenport, etc., I created a somewhat simplified version: WIN PCT = 1/(1 + (RA/RS)^1.6). If anyone’s curious, ask in the comments, and I’ll try to explain how I arrived at this version.
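
For anyone who wants to play with the formula in note 2, here is a minimal Python sketch. The function and parameter names are hypothetical, and the run totals in the usage lines are invented rather than taken from any real team.

```python
def win_pct(rs, ra, exponent=1.6):
    """WIN PCT = 1 / (1 + (RA/RS)^exponent), per note 2.

    rs and ra can be season totals or per-game figures, as long as
    both use the same unit. For this post's method, rs would be the
    (scaled) median RS and ra the actual RA.
    """
    if rs <= 0 or ra < 0:
        raise ValueError("rs must be positive and ra non-negative")
    return 1.0 / (1.0 + (ra / rs) ** exponent)


def expected_wins(rs, ra, games=162, exponent=1.6):
    """Projected wins over a season of the given length."""
    return games * win_pct(rs, ra, exponent)


# Invented per-game figures for illustration.
print(win_pct(4.5, 4.5))                # equal RS and RA gives 0.5
print(round(expected_wins(4.8, 4.4), 1))
print(win_pct(4.8, 4.4, exponent=2))    # Bill James' original exponent, for comparison
```

Algebraically, 1/(1 + (RA/RS)^x) is the same as RS^x/(RS^x + RA^x), so this is the familiar pythagorean shape with a smaller exponent.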
