Security issues and outages frequently affect tech companies, but the way a company responds says a lot about it.
On July 2, internet services provider Cloudflare experienced a global outage across its network, which resulted in visitors to Cloudflare-proxied domains being shown 502 “Bad Gateway” errors. The outage raised serious concern, given that Cloudflare’s purpose is to protect website visitors from security threats.
About a week later, in a company blog post of more than 4,000 words, Cloudflare detailed the series of events that led to the outage, what caused it, and how it corrected the situation. It also described how it will continue to go “deeper to protect against any further possible problems” by replacing the underlying technology.
How Cloudflare responded to the outage shows a company that knows how to communicate, how to be transparent with its users, and, ultimately, how to admit when it isn’t perfect.
Notably, Cloudflare did not cover up its failures; instead, it openly admitted fault. According to the blog post, the outage appeared to have been caused by an engineer improperly setting configuration settings, along with other compounding factors.
“We know how much this hurt our customers. We’re ashamed it happened. It also had a negative impact on our own operations while we were dealing with the incident,” Cloudflare wrote. “It must have been incredibly stressful, frustrating, and frightening if you were one of our customers. It was even more upsetting because we haven’t had a global outage for six years.”
On Twitter, various people, including engineers, credited Cloudflare for transparency.
Thanks for the detailed explanation, which we can all learn from. Most companies would probably try to cover up their failures. Now going to verify/profile some of my RegEx’s.
— Stefan Holm Olsen (@StefanHolmOlsen) July 14, 2019
Major respect for posting this. Cloudflare nails all of the important points of a post-mortem, leaving out any ambiguity.
— Ryan Hickman (@ryanhickman) July 12, 2019
The Cloudflare outage is an example of how communication between leadership and customers is critical for tech companies, says Ben Auton, vice president of operations at SpearTip, a cybersecurity advisory firm. “Cloudflare seemed to implement proper incident management and took the issue very seriously,” he says.
Cloudflare not only clearly indicated to the public the details of what went wrong but also posted a more technical explanation, which, in turn, helps other companies and individuals learn from what happened.
In addition, the company used a system status page to tell the public the exact status of the problem and what it was doing to resolve it, giving users the latest updates. Companies like Dropbox and Amazon also use status pages to document security concerns, according to Auton.
This approach seems to stand in contrast with Zoom’s recent handling of a security flaw that allowed any Mac user with Zoom software installed to be forced into a video call. The company initially dismissed the vulnerability. Only after a software engineer’s Medium post exposing the security concern went viral, and adverse media reaction ensued, did the company remove the feature causing the vulnerability. Auton points out that Apple also stepped in and updated macOS to remove the feature linked to the security concern. While Zoom addressed the security concerns on its company blog, the post was more general than technical, suggesting limits on how much the company wanted to expose.
A few days after the security flaw came out, on July 10, Zoom CEO Eric S. Yuan admitted to misjudging the situation and responding too slowly. “We take full ownership and we’ve learned a great deal,” Yuan wrote. In the future, Zoom says it plans on implementing more formal processes for people to submit security concerns, according to the post, including a “public vulnerability disclosure program” and an improved “process for receiving, escalating, and closing the loop on all future security-related concerns.”
The recent issue that Zoom faced is also a reminder to the newly public company that the bar has now been set higher. Auton says that tech giants Facebook, Google, and Microsoft have experienced a slew of security concerns over the years and are learning that customers increasingly demand greater transparency. And as the reaction to Cloudflare’s blog post shows, that’s not to say that users are expecting perfection. Rather, what they want is a proper response.