“Big Data” Does Not Tell Us What to Measure, and Ignores What Cannot Be Measured

(p. 6) BIG data will save the world. How often have we heard that over the past couple of years? We’re pretty sure both of us have said something similar dozens of times in the past few months.
If you’re trying to build a self-driving car or detect whether a picture has a cat in it, big data is amazing. But here’s a secret: If you’re trying to make important decisions about your health, wealth or happiness, big data is not enough.
The problem is this: The things we can measure are never exactly what we care about. Just trying to get a single, easy-to-measure number higher and higher (or lower and lower) doesn’t actually help us make the right choice. For this reason, the key question isn’t “What did I measure?” but “What did I miss?”
. . .
So what can big data do to help us make big decisions? One of us, Alex, is a data scientist at Facebook. The other, Seth, is a former data scientist at Google. There is a special sauce necessary to making big data work: surveys and the judgment of humans — two seemingly old-fashioned approaches that we will call small data.
Facebook has tons of data on how people use its site. It’s easy to see whether a particular news feed story was liked, clicked, commented on or shared. But not one of these is a perfect proxy for more important questions: What was the experience like? Did the story connect you with your friends? Did it inform you about the world? Did it make you laugh?
(p. 7) To get to these measures, Facebook has to take an old-fashioned approach: asking. Every day, hundreds of individuals load their news feed and answer questions about the stories they see there. Big data (likes, clicks, comments) is supplemented by small data (“Do you want to see this post in your News Feed?”) and contextualized (“Why?”).
Big data in the form of behaviors and small data in the form of surveys complement each other and produce insights rather than simple metrics.
. . .
Because of this need for small data, Facebook’s data teams look different than you would guess. Facebook employs social psychologists, anthropologists and sociologists precisely to find what simple measures miss.
And it’s not just Silicon Valley firms that employ the power of small data. Baseball is often used as the quintessential story of data geeks, crunching huge data sets, replacing fallible human experts, like scouts. This story was made famous in both the book and the movie “Moneyball.”
But the true story is not that simple. For one thing, many teams ended up going overboard on data. It was easy to measure offense and pitching, so some organizations ended up underestimating the importance of defense, which is harder to measure. In fact, in his book “The Signal and the Noise,” Nate Silver of fivethirtyeight.com estimates that the Oakland A’s were giving up 8 to 10 wins per year in the mid-1990s because of their lousy defense.
. . .
Human experts can also help data analysts figure out what to look for. For decades, scouts have judged catchers based on their ability to frame pitches — to make the pitch appear more like a strike to a watching umpire. Thanks to improved data on pitch location, analysts have recently checked this hypothesis and confirmed that catchers differ significantly in this skill.

For the full commentary, see:
ALEX PEYSAKHOVICH and SETH STEPHENS-DAVIDOWITZ. “How Not to Drown in Numbers.” The New York Times, SundayReview Section (Sun., MAY 3, 2015): 6-7.
(Note: ellipses added.)
(Note: the online version of the commentary has the date MAY 2, 2015.)
