Mailbag: How Tennis Can Improve Match Statistics
Jared Donaldson—into the top 50 for the first time this week—is our next podcast guest.
While the women play out the season in Singapore….
Mailbag
Have a question or comment for Jon? Email him at jon_wertheim@yahoo.com or tweet him @jon_wertheim.
As someone who has spent time around each of them, who would you say is the smartest of the Big Four?
—Liam, London
• “Smart” is, of course, a bit like “athletic.” Having capability on one axis doesn't mean you have it on another. There are athletes with savant-like hand-eye coordination, who can’t jump over credit cards. We all know of programming wizards or brilliant musicians who sometimes forget that pants go over underwear. What is smart? The person you might want as your SAT tutor is not necessarily the person you want fashioning your escape from the jungle.
But in terms of the Big Four and their intelligence, I’ll make two points. Their “smarts” may be different and may express themselves in different ways. In some cases, it’s facility for language; in others, it’s EQ or social smarts. In other cases (I’ll single out Djokovic by name) it’s an innate curiosity, a desire to seek. For others, it’s a quick wit. But overall, I would say the Big Four, collectively, are remarkably “smart,” and I don't qualify that by saying “for athletes.” Which leads me to point two.
The question is almost circular. As often as we hear the “dumb jock” archetype (and anyone who’s spent time around sports has examples), I would submit that, almost prima facie, it doesn't apply to the very best. That is, you cannot be a consistent, persistent winner, a long-term champion (which, of course, each of the Big Four is) without a significant measure of smarts. Some of this is mental fortitude and strategic savvy during competition. But this is also smarts in terms of career management. Making the right decisions. Avoiding bad choices. Making the right hires. Managing affairs and relationships. There is an element of intelligence in most every champion.
There's a lot of focus on Federer vs. Nadal as ATP Player of the Year. On the women's side, another veteran—Venus Williams—has appeared in more Grand Slam finals (2) and semifinals (3) than any other player. Does that mean she's a stronger contender for WTA Player of the Year?
—Thanks! Chris
• We all love Venus. We all admire her season. (A player reaches a certain point in their career, and I think we can dial back the no-cheering-in-the-press-box maxim.)
But, while you can be No. 1 and not win a major—cold, objective math—I would submit that you cannot be named MVP without a Big Prize. For better or worse, the majors are the tentpole events. It’s tough to be considered the year’s best when you don’t hold any of them.
It would be nice if she solidified her credentials with a title in Singapore at the year-end shebang. But the 2017 MVP is Garbine Muguruza, who won Wimbledon and garnished it with a Cincinnati title a few weeks later. Not the strongest MVP resume; but stronger than all the others.
First, a bit off the metrics/data question. Grunting is really a sign of a player looking for some advantage, no matter how slight. It is as bad on the men’s side as the women’s. When the grunt comes after the ball is in the air, it is a distraction to the other player, to say nothing of it making me crazy.
As to the data/metrics, too much information adds nothing to the pleasure of watching a match. A prime example is the harping on “after winning the first two sets, Federer has never lost a match.” That was the kiss of death, since he then went on to lose the match. “Past performance is not indicative of future results” is as true for sport as for stocks. See, e.g., “Halep has never beaten Sharapova,” until she did in Beijing.
I would rather hear Annacone or some of the other good analysts talk about why the last shot did or didn’t work, than to look at graphics of what percentage of shots went to what part of the court. As for Hawk-Eye, I’ll leave a sphere and a representative flat surface for some other time.
Keep on it.
—Emilio Bandiero, Shedd, Ore.
• All strong points. I feel like we should full stop and say this is the challenge of sports analytics and data-driven storytelling overall. There’s a great reader riff below that addresses this, too. But an onslaught of information doesn't necessarily enhance a broadcast, or even an understanding of what’s going on. And in some cases (statistically insignificant sample sizes, Simpson’s paradox, selection bias, generally dirty data) it misleads. I love data. It’s a force for good. It can be merely cool; it can also be a skeleton key to unlocking riddles. But it’s dangerous in the wrong hands.
I was thinking about this year’s tennis events, and I would argue that 2017 could one day be remembered as the most important year in men’s tennis history. If Federer hadn’t beaten Nadal in Australia and hadn’t later won Wimbledon, both Federer and Nadal would have 17 Slams at this point. The GOAT debate would be at its peak, and this scenario would reshape the whole discussion for years to come. I reckon Federer beating Nadal in Australia may be regarded as the most important match in tennis history and the defining match in their careers. Your thoughts?
—Mauricio Betti, São Paulo, Brazil
• I’m with you on your second point. The last half hour of the 2017 Australian Open men’s final might well be the single most important interval in the history of men’s tennis.
But I suspect that we’ll recall the 2017 season with a bit of sadness, a certain saudade if you will. For very different reasons, the season played out mostly in the absence (or diminished state) of three champions: Serena, Djokovic and Murray. Federer/Nadal was the headline of the year. I think most of us feel fortunate and heartened that this rivalry lives on, strong as ever in some respects. But did it compensate for the absences of three of the other top five stars?
I have noticed several distinct phenomena that may have turned the tide more to Federer now versus a decade ago.
1. Fed's topspin BH return of serve in 2017
2. Fed's ability to hold serve has never been better than now.
3. Fed's anticipation at net is better in 2017 than 2007
4. Nadal's defense 2017 is not what it was in 2007 (you only have to win the point once now).
5. Nadal's FH 2017 is not as lethal nor as dominating as it was in 2007. (He isn't taking over the point against Fed with his FH like he used to).
Agree/disagree? What are your thoughts? (P.S. Watch some 2007 Fedal YouTube highlights and I believe my points will be validated.)
—Ben, Queens, N.Y.
• These are good. It’s all very situational and surface-based. As we discussed last week, if Nadal had been able to get a crack at Federer on clay, the discussion would be a bit different. I agree that Federer’s fearlessness on the backhand—helped by the larger frame—marks the biggest difference. Someone—I want to say Pete Bodo; and I love Pete so I’ll give him credit here regardless—also mentioned that Federer’s pace of play, always brisk, picks up even more against Nadal. This is not by accident. It disrupts Nadal’s sacred rhythms and sends a message.
As for Nadal, it’s just a matter of consistency. Sometimes his forehand breaks down, especially up the line. Other times, it’s as strong as ever. Sometimes he serves brilliantly (see: the 2017 U.S. Open final, when he thoroughly out-served 6’8” Kevin Anderson). Other times, he struggles.
But—and again this cuts both ways—if the two had faced off even once on clay, I suspect it’s a different conversation we’re having.
There's a tennis match statistic that I would be interested in seeing: What percentage of points in which a receiver has gained neutrality do they win? Admittedly, "gaining neutrality" in a point is subjective to some extent, but it's always seemed to me that even when a point appears neutral, the server still has an advantage and wins more of the points. I'd like to see if this is true with real statistics, but if so, then this is a mental side of the receiving game that players could work on by trying to adjust their expectation in these points.
—Miles, Hudson, Mass.
• We are still getting lots of questions about tennis statistics and suggestions for improvements. I think we need Jeff Sackmann on the podcast. (Anyone interested in the tennis/data intersection should follow him.) This is an interesting issue Miles raises, though it involves some subjective judgments. But, yes, it does seem that even when returners set up a neutral ball, effectively neutering the inherent power of the serve, the server still wins the majority of the points. Is this true? If so, why?
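One way Miles's question might be put to the test: a minimal sketch, assuming point-by-point records that carry a rally length and a flag for whether the server won, and assuming (crudely) that a point counts as "neutral" once the rally reaches a chosen number of shots. The Point class, the min_shots threshold and the sample points are all hypothetical, not real match data.

```python
from dataclasses import dataclass


@dataclass
class Point:
    rally_len: int     # total shots in the point, serve included (assumed field)
    server_won: bool   # True if the server won the point (assumed field)


def server_win_rate_when_neutral(points, min_shots=5):
    """Share of 'neutral' points won by the server.

    A point is treated as neutral when the rally lasted at least
    `min_shots` shots. That threshold is an assumption, not a standard.
    Returns None if no point qualifies.
    """
    neutral = [pt for pt in points if pt.rally_len >= min_shots]
    if not neutral:
        return None
    return sum(pt.server_won for pt in neutral) / len(neutral)


# Made-up points purely for illustration.
sample = [
    Point(1, True), Point(3, True), Point(6, True), Point(7, False),
    Point(9, True), Point(5, False), Point(2, False), Point(8, True),
]
rate = server_win_rate_when_neutral(sample)
print(rate)
```

If the rate stayed well above one half across a large set of real matches, that would support the hunch that the server keeps an edge even in rallies that look even.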
Me? I’d like to see some equivalent of “leverage” stats. That is, statistics controlled for the most critical intervals. In baseball, a player who hits a home run when it’s 8-0 in the sixth inning is accorded one leverage rating; when he hits that same home run when it’s tied in the bottom of the ninth, it’s a different, higher one.
We talk often in tennis about “big points.” Why not account for this? Hitting an ace at 40-0 in the first game is one thing; hitting it at 5-5 in the tie-breaker is something else entirely.
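To make the idea concrete, here is a rough sketch of a point-level leverage number: the gap between a player's chance of winning the current game (or tie-breaker) if he wins the point and his chance if he loses it. It assumes every point is won independently with the same probability, ignores the alternating serve in a tie-breaker, and measures leverage only within the game or tie-breaker rather than across the whole match. The function names and the 0.65 figure are hypothetical.

```python
from functools import lru_cache


def win_prob(a, b, p, target):
    """Chance of winning a first-to-`target`, win-by-two unit from score a-b,
    assuming each point is won independently with probability p."""
    q = 1.0 - p

    @lru_cache(maxsize=None)
    def go(x, y):
        if x >= target and x - y >= 2:
            return 1.0
        if y >= target and y - x >= 2:
            return 0.0
        if x == y and x >= target - 1:
            # Deuce-like state: closed form for winning by two from level.
            return p * p / (p * p + q * q)
        return p * go(x + 1, y) + q * go(x, y + 1)

    return go(a, b)


def leverage(a, b, p, target):
    """Swing in win probability riding on the next point at score a-b."""
    return win_prob(a + 1, b, p, target) - win_prob(a, b + 1, p, target)


P_POINT = 0.65  # hypothetical chance of winning any given point

# Points are counted 0, 1, 2, 3 for love, 15, 30, 40; a game is first to 4.
ace_at_40_0 = leverage(3, 0, P_POINT, target=4)
# A tie-breaker is first to 7, win by two.
ace_at_5_5_tiebreak = leverage(5, 5, P_POINT, target=7)

print(f"40-0 in a game:        {ace_at_40_0:.3f}")
print(f"5-5 in a tie-breaker:  {ace_at_5_5_tiebreak:.3f}")
```

Under these assumptions the tie-breaker point carries many times the swing of the 40-0 point, and that understates the difference, since the tie-breaker decides a whole set.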
I wanted to add a few tips to yours in response to Jay's question about experiencing London with his sons, having taken mine (10 & 13) this past July.
- Pubs in Covent Garden—and I would recommend eating in them
- Agree about Indian food: Gymkhana and Dishoom are the better ones, though Dishoom does not accept reservations
- Chinatown
- Agree about Tussauds but the London Eye is a good way to take in the city especially when you have limited time—spring for the express ticket to save in queues
- Thames River Cruise, Hop On Hop Off bus, virtually any museum in London, Wimbledon, Chelsea village
That should be good enough.
—Sanjeet
• Thanks!
Shots, Miscellany
• Today, Monica Puig, 2016 Olympic Gold medallist, and five-time Grand Slam Champion Maria Sharapova visited Puerto Rico, in order to help and support the victims of Hurricane Maria. The island of Puerto Rico was severely damaged last month, with residents suffering without power and supplies. Supported by the donations received through Puig’s fundraising page, www.youcaring.com/donatewithmonica, the players delivered goods purchased with more than $140,000 in donated funds that were shipped to the island from the U.S. mainland.
Puig and Sharapova handed out supplies, including 1,250 gas stoves, 1,000 solar-powered light/radio units, 3,000 propane cylinders, and medicine for San Jorge’s Children’s Hospital. Additionally, 200 cell phones were donated locally by AT&T, Puig’s sponsor, as well as pre-packed bags of groceries provided by the Red Cross and the Ricky Martin Foundation.
• Under Armour explores leaving tennis.
• Sam Groth announces his retirement post-Aussie Open.
• 2017 Volvo Car Open champion Daria Kasatkina will return to Charleston in 2018 to defend her title. The 20-year-old Russian will join U.S. Open champion Sloane Stephens, two-time Wimbledon champion Petra Kvitova and U.S. Open finalist Madison Keys in the growing player field.
• Andy Murray announced the launch of an online Charity Fundraising Auction www.amlauction2017.com in the build up to next month’s Andy Murray Live presented by SSE. Andy will play Roger Federer in the tennis exhibition event at The SSE Hydro, Glasgow, which aims to raise as much money for charity as possible. The online auction will run from today until Monday 6th November, with all the proceeds going to charity.
• Ted Ying of Laurel, MD has LLS: Stevie Johnson and Dabney Coleman.
• Saif Shahin of lovely Bowling Green, Ohio takes us out with a reader riff:
I've been reading your articles and Mailbag for a while, but writing for the first time. This is in response to the reader’s Oct. 11 comments on the use of statistics in tennis. I would love to see it in next week's Mailbag. If you do decide to use it, please feel free to edit it for length.
The reader raised the issue of lack of significance testing when reporting statistics in tennis. He also seems to imply that lack of significance tests is the difference between descriptive and inferential statistics.
Two points here. Even descriptive statistics, such as cross-tabulations, can require significance tests. Second and more importantly, he misses the point of significance testing.
The main purpose of such tests is to establish that the patterns witnessed in data are not simply true for the small sample being analyzed but can be generalized to the entire population from which the sample has been collected. So if you do a survey of 5,000 or 10,000 or even 1 million randomly selected people in a country with a population of 320 million, significance tests allow you to say more confidently that your findings are true not just for those small numbers you sampled but for the whole country.
But significance testing has little meaning if you are analyzing not a sample but the entire population. In such analyses, the patterns you find are obviously true for the population as a whole.
That is what mostly happens in data-based tennis articles—or at least the ones I have come across on the ATP/WTA websites and a few other places. Analysts don't use small samples of data about a player. They use ALL of the data related to whatever aspect of his/her game they are analyzing. They can be perfectly confident of the reliability of their findings—way more than if they were using small samples along with reliability tests.
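The sample-versus-population distinction can be sketched with the textbook margin of error for a proportion, including the finite population correction. This is an illustration under standard simple-random-sampling assumptions, not a description of how any tennis site computes its numbers. The function name and the example figures are hypothetical.

```python
import math


def margin_of_error(p_hat, n, population, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Uses the usual standard error sqrt(p(1-p)/n) multiplied by the
    finite population correction sqrt((N - n) / (N - 1)).
    """
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    correction = math.sqrt((population - n) / (population - 1))
    return z * standard_error * correction


# A survey of 5,000 people out of 320 million: real sampling uncertainty.
survey = margin_of_error(0.5, 5_000, 320_000_000)

# Every one of a (hypothetical) 4,000 service points a player hit this
# season: the "sample" is the whole population, so the correction is zero.
full_season = margin_of_error(0.5, 4_000, 4_000)

print(f"survey margin:      {survey:.4f}")
print(f"full-season margin: {full_season:.4f}")
```

When the analyst has every point, the sampling uncertainty vanishes, which is the sense in which a significance test adds little. Whether the season's numbers predict next season is a separate question the formula does not answer.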
The reader also brings up the issue of correlation versus causation. The difference between them is not statistical but explanatory. Even regression analysis does not imply causal relationships per se, as Ross seems to suggest. Correlations can indicate causal relations too if you can meaningfully explain how one thing leads to another. So a correlation between the increase in Federer's backhand speed, or the number of times he hits it down the line, and the increase in the proportion of points he wins against Nadal can very well indicate a causal relationship.
For me, the sophistication of statistics is not an issue. Use the analysis that is apt for answering a particular question—descriptive or inferential. It is the kind of questions that analysts ask that can at times be ridiculous. I've seen articles offering analyses of reams of data without really saying anything meaningful.
ATP Beyond The Numbers recently published an article on who has held serve more often this year while down 15/40. Nadal and Dimitrov have the best numbers and Federer, who has easily been the best server all year, is not even in the Top 10 (probably because he rarely goes down 15/40). Such analyses have little value as they don't address any meaningful questions. That is the bigger concern with data-oriented tennis reporting.
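The parenthetical about rarely going down 15/40 can be checked with a back-of-the-envelope model. A minimal sketch, assuming each service point is won independently with a fixed probability; the function names and the two probabilities below are hypothetical, not actual player figures.

```python
def prob_reach_15_40(p):
    """Chance a service game reaches 15-40: the server wins exactly one
    of the first four points (four possible orders)."""
    q = 1.0 - p
    return 4.0 * p * q ** 3


def prob_hold_from_15_40(p):
    """Chance of holding from 15-40: win the next two points to reach
    deuce, then win from deuce."""
    q = 1.0 - p
    win_from_deuce = p * p / (p * p + q * q)
    return p * p * win_from_deuce


for label, p in [("stronger server", 0.70), ("weaker server", 0.60)]:
    print(
        f"{label}: reaches 15-40 in {prob_reach_15_40(p):.1%} of games, "
        f"holds from there {prob_hold_from_15_40(p):.1%} of the time"
    )
```

In this toy model the stronger server faces 15-40 about half as often as the weaker one, so his record from that score rests on far fewer games and says less about him.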