Quality control: The numbers behind John Lowe's quality start stat

As the inventor of the quality start retires, a deeper look at the numbers behind a maligned and underappreciated piece of baseball statistics.

Jay Jaffe | Oct 23, 2014

Earlier this week, John Lowe, a writer for the Detroit Free Press for the past 29 years — and an extremely well-respected one at that — announced his retirement. If you don't know Lowe by name, you almost certainly know at least one facet of his legacy, for he invented a statistic: the quality start.

The year was 1985, and Lowe was writing for the Philadelphia Inquirer (he moved along to Detroit the next year). Noting the decline of the complete game and the evolving philosophy of managers with regard to their expectations for their starting pitchers, he strove to find a descriptive stat that recognized this change. As he told Murray Chass in 2011:

"I got the idea in 1983 and '84," Lowe said. "I was hearing managers saying they were looking for six innings from their pitchers. I heard Whitey Herzog say 'all I want from my pitchers is six good innings.'"

That's where six innings came from. And the runs? "Six and two is too stingy, six and four is too much. I wasn't going to get into a more than or less than. This was new and had to be understandable."

Why the need for a new statistic? "I didn’t like ERA as a definitive stat," Lowe said. "One bad start could wreck your ERA. But I never said don't look at wins and losses."

Thus was born the quality start, credited to a starter for any outing in which he pitches at least six innings and allows no more than three earned runs. Due to the vagaries of offensive and bullpen support, that starter may not actually get a win for his effort, but by and large, he's done a good job of keeping his team in the game. Note that while Lowe didn't explicitly distinguish between earned and unearned runs in answering Chass, he clearly wanted something in the same ballpark as ERA.
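As a minimal sketch, the definition above can be written as a one-line check. The function name and the choice to count outs rather than innings (18 outs being six full innings) are assumptions made here for illustration, not anything Lowe specified.

```python
def is_quality_start(outs_recorded: int, earned_runs: int) -> bool:
    """Return True for at least six innings (18 outs) and at most three earned runs."""
    return outs_recorded >= 18 and earned_runs <= 3


print(is_quality_start(18, 3))  # exactly six innings, three earned runs -> True
print(is_quality_start(17, 0))  # one out short of six innings -> False
print(is_quality_start(27, 4))  # complete game, but four earned runs -> False
```

Counting outs sidesteps the ambiguity of box-score notation, where "6.1" means six and one-third innings rather than a decimal.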

Historically speaking, quality starts occur in a bit more than half of all games, with their rate essentially mirroring scoring levels. One stat's peak is the other's nadir; since 1950, the extremes on either side for quality starts have coincided with the highest and lowest scoring levels in that span. In 1968, the "Year of the Pitcher," teams scored an average of 3.42 runs per game (the lowest level since 1908) and pitchers made quality starts 62.6 percent of the time. In 1996, when teams averaged 5.04 runs per game (the highest level since 1936), pitchers made quality starts in 45.8 percent of games.

In 2014, when teams scored an average of 4.07 runs per game, pitchers made quality starts 54.0 percent of the time, up from 52.6 percent of the time in 2013. As you'd expect, the best pitchers made quality starts with the highest frequency. Here are the top 10 in each of the AL and NL in terms of rate:

National League
Rank	Pitcher	Team	GS	QS	QS%
1	Clayton Kershaw	Dodgers	27	24	88.9
2	Johnny Cueto	Reds	34	29	85.3
3	Cole Hamels	Phillies	30	25	83.3
4	Alex Wood	Braves	24	19	79.2
5	Adam Wainwright	Cardinals	32	25	78.1
6	Aaron Harang	Braves	33	25	75.8
7	Julio Teheran	Braves	33	25	75.8
8	Jordan Zimmermann	Nationals	32	24	75.0
9	Lance Lynn	Cardinals	33	24	72.7
10	Doug Fister	Nationals	25	18	72.0

American League
Rank	Pitcher	Team	GS	QS	QS%
1	Jon Lester	Red Sox/Athletics	32	27	84.4
2	Chris Sale	White Sox	26	21	80.8
3	Felix Hernandez	Mariners	34	27	79.4
4	Sonny Gray	A's	33	26	78.8
5	Corey Kluber	Indians	34	26	76.5
6	Yordano Ventura	Royals	30	22	73.3
7	Garrett Richards	Angels	26	19	73.1
8	Dallas Keuchel	Astros	29	21	72.4
9	David Price	Rays/Tigers	34	24	70.6
10	James Shields	Royals	34	24	70.6

For those tables, I stuck with the 162-inning cutoff used for ERA qualifiers, which means that Masahiro Tanaka (80 percent in 20 starts), Jacob deGrom (77.3 percent in 22 starts) and Hyun-Jin Ryu (73.1 percent in 26 starts) were bumped, though they at least deserve mention. Those short-season pitchers aside, that's just about everybody you'd place on a Cy Young ballot in either league, not to mention just about everybody from the Wins Above Replacement leaderboards. Everybody who's anybody is making quality starts.
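The rate column in those tables is simply quality starts divided by games started, expressed as a percentage. A short sketch, with a hypothetical helper name and a few rows copied from the tables above:

```python
def quality_start_rate(games_started: int, quality_starts: int) -> float:
    """Quality starts as a percentage of games started, rounded to one decimal."""
    if games_started <= 0:
        raise ValueError("games_started must be positive")
    return round(100 * quality_starts / games_started, 1)


# (pitcher, games started, quality starts), taken from the tables above
rows = [
    ("Clayton Kershaw", 27, 24),
    ("Jon Lester", 32, 27),
    ("Masahiro Tanaka", 20, 16),  # 80 percent in 20 starts, per the text
]

for pitcher, gs, qs in rows:
    print(f"{pitcher}: {quality_start_rate(gs, qs)}")
# Clayton Kershaw: 88.9
# Jon Lester: 84.4
# Masahiro Tanaka: 80.0
```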

And yet the stat has its critics, though not unreasonably so. One primary criticism is that the statistic rewards mediocrity or worse, namely by granting credit for an outing in which a pitcher produces a 4.50 ERA (six innings, three runs). During the season that Lowe invented the stat, the major league ERA was 3.89; it had been exactly 4.00 in 1977 and 1979 but hadn't been around 4.50 since 1936, part of a much higher-scoring era. A 4.50 mark was out of step with the times, though Lowe couldn't have foreseen that scoring levels would rise sharply in the 1990s and 2000s. The major league ERA during the strike-shortened 1994 season was 4.51, and from that year until 2007, it was within one-tenth of a run (0.10) of 4.50 nine times, topping it five times with a high of 4.77 in 2000.

Cumulatively, the major league ERA was 4.48 for the 1994-2007 stretch, before scoring levels began to taper off significantly; it's been below 4.00 in three of the past four seasons, with this year's 3.74 mark the lowest since 1989 (3.71). Over a broader swath of history, the major league ERA is lower; from 1961 (the start of the expansion era) to 2014, it's 4.00, while from 1973 (the start of the designated hitter era) to 2014, it's 4.10, and from 1993 (the start of the more recent expansion wave, and the point when scoring and home run levels began to soar) to 2014, it's 4.35.

Given that, the 4.50 ERA threshold doesn't represent average, but actually functions more like a replacement level — a baseline that has real value when it comes to measuring performance. That isn't to say that 4.50 is an ideal place to set that line, given the degree to which scoring levels fluctuate over time. As I'll show below, a 4.50 ERA produces a winning percentage that's below .500 but still well above the .320 percentage that is used by both Baseball-Reference.com and FanGraphs (which is to say that a team full of replacement level players would win 32 percent of its games).
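The 4.50 figure attached to the minimum quality start follows from the standard ERA formula, earned runs per nine innings. A small sketch (the function name is an assumption for illustration, and innings are again expressed as outs):

```python
def earned_run_average(earned_runs: int, outs_recorded: int) -> float:
    """Earned runs allowed per nine innings (27 outs)."""
    if outs_recorded <= 0:
        raise ValueError("a pitcher must record at least one out")
    return 27 * earned_runs / outs_recorded


print(earned_run_average(3, 18))            # six innings, three earned runs -> 4.5
print(round(earned_run_average(3, 21), 2))  # seven innings, three earned runs -> 3.86
```

Any quality start that goes deeper than six innings, or allows fewer than three earned runs, comes in below 4.50.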

Critics of the quality start stat also complain that it preserves the artificial distinction between earned and unearned runs, a distinction set more than a century ago, when fielders' mitts were nonexistent or rudimentary and errors abounded. In 1901, the year the American League came into existence, 32 percent of all runs were unearned, whereas they've accounted for less than 10 percent of all runs every year since 1991, and just 8.3 percent this year.

Given that it's the pitcher's job to prevent all runs, earned or unearned, and that we now have more sophisticated ways to account for the separation between pitching and defense (such as the Fielding Independent Pitching stat), there's certainly an argument to be made for ditching the distinction — far beyond the realm of quality starts. But in order to understand the stat better, I'm ignoring that for now.

Back to what we'll call the 4.50 case, by which I mean exactly six innings and three earned runs. A few years ago, it inspired me to dig deep into the concept of quality starts. What its critics don't realize is how rarely the 4.50 case occurs; from 1950 to 2010 (the span I used when I wrote my piece in 2011, given that Retrosheet data only went back so far), it accounted for 5.9 percent of quality starts and just 3.0 percent of all starts. In 2014, it accounted for 8.5 percent of quality starts and just 4.6 percent of all starts. In those 2014 games alone (games in which starters delivered exactly six innings with three earned runs allowed), teams went 98-124, for a .441 winning percentage.
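The arithmetic behind those figures is easy to check. This sketch uses only numbers quoted in this article; the helper name is hypothetical.

```python
def winning_percentage(wins: int, losses: int) -> float:
    """Wins divided by decisions."""
    return wins / (wins + losses)


# Team record in 2014 starts of exactly six innings and three earned runs
print(f"{winning_percentage(98, 124):.3f}")  # 0.441

# Share of all 2014 starts = share of quality starts x overall quality start rate
share_of_quality_starts = 0.085  # 8.5 percent of quality starts
quality_start_rate = 0.540       # 54.0 percent of all starts
print(f"{share_of_quality_starts * quality_start_rate:.1%}")  # 4.6%
```

The second calculation confirms that the two 2014 percentages are consistent with the 54.0 percent quality start rate cited earlier.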

Given that, you might think that there isn't much separation between teams that receive a quality start versus teams that don't, but you'd be far off base. Historically, teams receiving quality starts win around two-thirds of the time. For the post-1960 expansion era, their winning percentage is .674; for 2014, it was .660. What's more, there's a massive gulf in collective performance between pitchers who put up quality starts versus those who don't; generally, the former group has an ERA a bit below 2.00, the latter above 7.00. In 2014, the split was a 1.88 ERA for those pitching quality starts, 6.97 for the rest. That's actually very similar to the performance of all pitchers in wins (1.82 this year) and losses (7.33).

Many who are conceptually on board with the idea of counting some kind of quality starts take issue with the thresholds that Lowe defined. ROOT Sports, the channel that broadcasts Mariners games and thus King Felix’s starts, introduced two variants: the "ultra quality start" (at least seven innings, with no more than two earned runs) and the "mega quality start" (at least eight innings, with no more than one earned run). CBS Sports' Dayn Perry offered the "dominant start," dispensing with the earned/unearned run distinction and considering only starts of at least eight innings with no more than one run allowed.

All of those are reasonable points to draw the line, perhaps more useful than six innings and three earned runs, thresholds I've defended in the past given their broad applicability across large swaths of baseball history. But having spent hours sifting through Play Index data for a notion of where a better place to draw the line might be, I'm struck by the diminishing returns. For example, if we were to redefine a quality start as seven or more innings, three or fewer earned runs, we'd find that in 2014, that described just 29.1 percent of starts (compared to 54.0 for the 6/3 thresholds), and teams receiving such starts had a .707 winning percentage (up from .660). The "ultra quality start" definition describes 25.2 percent of 2014 starts, and teams post a .741 winning percentage. Such starts are much scarcer commodities, but does drawing the line there tell us significantly more than the original stat?
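The original stat and the ROOT Sports variants differ only in where the two thresholds sit, so they can be sketched as parameters of one check. The class and function names here are assumptions for illustration. Perry's "dominant start" is left out because it counts all runs rather than earned runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartThreshold:
    name: str
    min_outs: int          # minimum outs recorded (three per inning)
    max_earned_runs: int   # maximum earned runs allowed

    def matches(self, outs_recorded: int, earned_runs: int) -> bool:
        return outs_recorded >= self.min_outs and earned_runs <= self.max_earned_runs


THRESHOLDS = [
    StartThreshold("quality start", 18, 3),        # six innings, three earned runs
    StartThreshold("ultra quality start", 21, 2),  # seven innings, two earned runs
    StartThreshold("mega quality start", 24, 1),   # eight innings, one earned run
]


def labels_for(outs_recorded: int, earned_runs: int) -> list[str]:
    """Names of every threshold a given start satisfies."""
    return [t.name for t in THRESHOLDS if t.matches(outs_recorded, earned_runs)]


print(labels_for(21, 2))  # ['quality start', 'ultra quality start']
print(labels_for(24, 3))  # ['quality start']
```

Each stricter definition is a subset of the one before it, which is why the share of qualifying starts shrinks as the winning percentage rises.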

I'd argue not, for if we want more sophistication in recognition of quality, we can turn to FIP or WAR. Having something distinct from those two — easy enough to calculate at a glance when combing through the morning box scores — is why I continue to use the original definition of the quality start and to marvel at the stat's simple elegance. Thus, my tip of the cap to the retiring John Lowe, for yet another quality piece of work.

Jay Jaffe is a contributing baseball writer for SI.com and the author of the upcoming book The Cooperstown Casebook on the Baseball Hall of Fame.