“I would be surprised if there is even a single person who can answer this tight question definitively,” the engineer said, in an exchange from court testimony that was first reported by the Intercept. Facebook provided the court with a list of 55 systems and databases where user data could be stored.
Tech giants like Google, Facebook, and Twitter were founded more than 15 years ago, and they developed loose cultures in which individual engineers and teams could build databases, algorithms, and other software independently of each other. Speed was prioritized over security measures that could slow things down. That was before years of privacy lawsuits and legislation pushed companies to tighten up their data practices.
But experts said the companies are still struggling to pay off years of technical debt as regulators and consumers demand more from tech companies, such as the ability to delete data or know exactly what is being collected about a person. And some of those practices that prioritize speed haven’t changed.
“Many engineers at Twitter had an attitude that security measures made their lives difficult and slowed people down,” said Edwin Chen, who has held engineering roles at Twitter, Google and Facebook and is now CEO of the content moderation start-up Surge AI. “And this is definitely a bigger problem than just Twitter.”
Some of these systems are black boxes even for the people who built them, said Katie Harbath, a former director of policy at Facebook and now CEO of the consultancy Anchor Change. (Facebook changed its name to Meta last year.) Even if the right policies are in place, they can be difficult to enforce when the underlying databases aren’t built to answer questions such as where all of a person’s location or profile data might be stored.
“It’s hard to start over, especially the older you get,” she said. “The way these platforms were originally created, each team had a huge amount of autonomy.”
In the Meta case, a class-action lawsuit in Northern California related to the Cambridge Analytica privacy scandal that the company settled last month, the plaintiffs demanded that the company reveal all the information it collects and stores about them. This can include people’s exact locations during the day, health conditions they have searched for or groups they have joined, and inferences such as a person’s likelihood of marriage.
Facebook initially provided data from the company’s “Download Your Information” tool, but a judge found in 2020 that the information was too limited. Facebook’s response, recorded in a filing this summer, was essentially that even the company’s own engineers aren’t sure where all the data resides.
Dina El-Kassaby, a spokeswoman for Meta, Facebook’s parent company, said the filing does not mean the company is failing on security or data access issues. “Our systems are sophisticated, and it should come as no surprise that no single company engineer can answer every question about where every piece of user information is stored,” she said. “We’ve built one of the most comprehensive privacy programs to oversee data usage across our operations and carefully manage and protect people’s data. We have made – and continue to make – significant investments to meet our privacy commitments and obligations, including extensive data controls.”
At Tuesday’s Senate hearing, Zatko, the whistleblower and former security chief, made similar comments about Twitter. He noted that in a recent data breach, Twitter had accidentally leaked the personal information of 50 million employees (Zatko’s lawyer later issued a correction saying Zatko meant 20,000).
Zatko pointed out at the hearing that Twitter doesn’t have anything close to that many employees — the current number is 7,000 — and noted that Twitter is keeping too much information about former employees and contractors that it fails to delete.
He repeatedly claimed that the company had as many as 4,000 engineers, more than half of all employees at the company, with extensive access to internal systems and few ways to officially track who had access to what. This was a dangerous situation, he said, because an individual employee could take over a Twitter account and impersonate its owner.
The risks of giving employees broad latitude to access user data are much greater if such an employee is secretly working for a foreign government. Zatko has alleged that Twitter knowingly had employees working for both the Indian and Chinese governments, but has not provided evidence to support these allegations.
And in a separate report on the company’s ability to handle misinformation that was included in Zatko’s filing with Congress, an independent auditor noted that Twitter lacked a formal system to track cases of users who had violated company rules.
Twitter has repeatedly disputed Zatko’s arguments. A spokeswoman, Rebecca Hahn, previously told The Washington Post that Twitter had tightened up security broadly since 2020, that its security practices are within industry standards and that it had specific rules about who can access the company’s systems. In response to Tuesday’s hearing, Hahn reiterated that Zatko’s arguments were “riddled with inconsistencies and inaccuracies” but declined to specify any details.
David Thiel, chief technical officer at the Stanford Internet Observatory at Stanford University and a former security engineer at Facebook, said that after reading Zatko’s revelations, he got the impression that Twitter’s security processes appeared to be years behind those of Facebook. He noted that Facebook significantly tightened access in response to various controversies over the years, including the allegation that Facebook had allowed Cambridge Analytica access to user data, to the point that if an engineer accessed a system they did not have permission to access, “someone will come after you and you will be fired.”
But he said it’s still common in Silicon Valley to give engineers broad access so they can “build interesting products quickly.”
“The emphasis,” he said, “is still on speed and accessibility.”
He said that sometimes companies, including Facebook, cannot know everything that is inside their systems.
For example, machine learning systems and software algorithms consist of tens of thousands of data points, often computed instantaneously. While it is possible to feed data points into such a system, one cannot then work backward to recover the original inputs. He drew a food analogy, noting that it would be impossible to return a soup to its original ingredients.
But other data, he said, is simply complex, and companies are resistant to the extensive work it might take to find it all — and would probably only do so if forced to by new laws or court rulings.
It’s not “so complicated that it’s not feasible,” he said.