In data we trust. Is that such a good idea?


“Look at the data.”

I’ve heard that business bromide many times, and it isn’t reassuring.

Companies face pressure to be data-driven, data-centric. Hey, we’re all data scientists now. But data can be manipulated and misunderstood. It can also be just plain wrong.

Flubs abound. In 2022, Equifax provided millions of inaccurate consumer credit scores to lenders. A class action lawsuit followed. Then there’s Uber, whose data quality controls missed an accounting screwup that cost it $45 million in back pay for drivers.

For trust in business data, it’s the best and the worst of times.

Thanks to technological advances over the past five years, software now ably handles “plumbing-like tasks” such as capturing and moving data, says Sean Taylor, cofounder and chief scientist at startup Motif Analytics. So people can focus on more complex analysis, whether it’s data visualization or using data to answer questions.

Overall, data is more trustworthy too. “As the technology is getting better and more reliable, we can believe that what we have in the dataset is more likely to be reflective of reality,” Taylor says.

But it’s a double-edged sword. Data literacy and tools have improved so much that many employees can do analysis on their own, which gives Taylor pause. “It’s very hard to know if it’s good, to give broad access to more data to more people.”

One obvious hazard is misinterpretation. “At best, it’s just random mistakes,” says Taylor, a Facebook and Lyft alumnus. “At worst, it’s some kind of systematic mistake that leads to a big blind spot, or something that’s disastrous for your company.”

Another pitfall: motivated reasoning. “Someone wants the answer to be a certain thing, and they can torture the data and use their lack of skill as a way to come up with whatever conclusion that they want,” Taylor says.

Charles Pensig, founding partner of Stratus Data, sees a trade-off between ease of use and discipline. “I think we’re going through with data what we’ve been going through with media,” says Pensig, whose boutique data analytics consulting firm has worked with Disney, Foxconn, and General Electric. “We’ve realized that it is paramount to have an editorial process to ensure that the products that are generated are high-quality.”

For Taylor, that starts with figuring out what data gets tracked or captured. “The first step in quality is designing processes that emit the kind of data that are suitable for the questions that you have.” At Lyft, for example, Taylor and his colleagues tracked drivers canceling rides and measured how long users waited before that happened. Combined, those two things helped show if driver cancellations caused bad experiences for customers.
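
To make that concrete, here is a minimal Python sketch of combining the two signals. It is an illustration only: the records, the field names (`driver_cancelled`, `wait_seconds`) and the `cancellation_summary` helper are all hypothetical, not Lyft's actual schema or method.

```python
from statistics import median

# Hypothetical ride records; field names are illustrative assumptions.
rides = [
    {"ride_id": 1, "driver_cancelled": True, "wait_seconds": 420},
    {"ride_id": 2, "driver_cancelled": False, "wait_seconds": 180},
    {"ride_id": 3, "driver_cancelled": True, "wait_seconds": 60},
    {"ride_id": 4, "driver_cancelled": False, "wait_seconds": 240},
]

def cancellation_summary(rides):
    """Pair the driver-cancellation rate with how long riders waited first."""
    cancelled = [r for r in rides if r["driver_cancelled"]]
    rate = len(cancelled) / len(rides) if rides else 0.0
    # Median wait is only defined when at least one ride was cancelled.
    median_wait = median(r["wait_seconds"] for r in cancelled) if cancelled else None
    return {"cancel_rate": rate, "median_wait_before_cancel": median_wait}

summary = cancellation_summary(rides)
print(summary)
```

Neither number says much alone. A high cancellation rate paired with long waits before the cancellation is the pattern that would point to a bad customer experience.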

Second, businesses can seek out flaws or defects. That’s a job for Superconductive, whose Great Expectations platform lets customers test data to confirm their beliefs about it. For example, if a company thinks it has 10,000 product SKUs and testing reveals that it actually has 20,000, something’s wrong with the data.
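
The SKU example can be sketched in a few lines of plain Python. This is not the Great Expectations API. The `check_distinct_count` helper, the 5% tolerance and the sample catalog are assumptions made up for illustration.

```python
def check_distinct_count(values, expected, tolerance=0.0):
    """Compare the number of distinct values with what the business believes.

    Returns (ok, observed). `tolerance` is a fraction of `expected`.
    """
    observed = len(set(values))
    ok = abs(observed - expected) <= expected * tolerance
    return ok, observed

# Hypothetical catalog: the company believes it has 10,000 SKUs,
# but the data actually holds 20,000 distinct ones.
skus = [f"SKU-{i:05d}" for i in range(20_000)]

ok, observed = check_distinct_count(skus, expected=10_000, tolerance=0.05)
if not ok:
    print(f"Expected about 10,000 SKUs, found {observed}: investigate the data.")
```

The point of a check like this is that the belief is written down before anyone looks at the data, so a mismatch becomes a visible failure rather than a quiet surprise.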

The third step is having someone check the work, which Taylor compares to a peer review. “If you do the analysis multiple ways and you keep getting the same finding, then you feel confident.”

Pensig’s advice: Don’t collect reams of data now and ask questions later. “How do I encode this stuff so that I can actually use it?”

Make no mistake, that’s a key data point.

Nick Rockel
nick.rockel@consultant.fortune.com

This story was originally featured on Fortune.com
