When data broker SafeGraph got caught selling location information on Planned Parenthood visitors, it had a public relations trick up its sleeve. After the company agreed to remove family planning center data from its platforms in response to public outcry, CEO Auren Hoffman tried to flip the narrative: he claimed that his company’s harvesting and sharing of sensitive data was, in fact, an engine for beneficial research on abortion access. He even argued that SafeGraph’s post-scandal removal of the clinic data was the real problem: “Once we decided to take it down, we had hundreds of researchers complain to us about…taking that data away from them.” Of course, when pressed, Hoffman could not name any individual researchers or institutions.
SafeGraph is not alone among location data brokers in trying to “research wash” its privacy-invasive business model and data through academic work. Other shady actors like Veraset, Cuebiq, Spectus, and X-Mode also operate so-called “data for good” programs with academics, and have seized on the pandemic to expand them. These data brokers provide location data to academic researchers across disciplines, with resulting publications appearing in peer-reviewed venues as prestigious as Nature and the Proceedings of the National Academy of Sciences. These companies’ data is so widely used in human mobility research—from epidemic forecasting and emergency response to urban planning and business development—that the literature has progressed to meta-studies comparing, for example, Spectus, X-Mode, and Veraset datasets.
Data brokers variously claim to be bringing “transparency” to tech or “democratizing access to data.” But these data sharing programs are nothing more than data brokers’ attempts to control the narrative around their unpopular and non-consensual business practices. Critical academic research must not become reliant on profit-driven data pipelines that endanger the safety, privacy, and economic opportunities of millions of people without any meaningful consent.
Data Brokers Do Not Provide Opt-In, Anonymous Data
Location data brokers do not come close to meeting human subjects research standards. This starts with the fact that meaningful opt-in consent is consistently missing from their business practices. In fact, Google concluded that SafeGraph’s practices were so out of line that it banned any apps using the company’s code from its Play Store, and both Apple and Google banned X-Mode from their respective app stores.
Data brokers frequently argue that the data they collect is “opt-in” because a user has agreed to share it with an app—even though the overwhelming majority of users have no idea that it’s being sold on the side to data brokers who in turn sell to businesses, governments, and others. Technically, it is true that users have to opt in to sharing location data with, say, a weather app before it will give them localized forecasts. But no reasonable person believes that this constitutes blanket consent for the laundry list of data sharing, selling, and analysis that any number of shadowy third parties are conducting in the background.
On top of being collected and shared without consent, the data feeding into data brokers’ products can easily be linked to identifiable people. The companies claim their data is anonymized, but there’s simply no such thing as anonymous location data. Information about where a person has been is itself enough to re-identify them: one widely cited study from 2013 found that researchers could uniquely characterize 50% of people using only two randomly chosen time and location data points. Data brokers today collect sensitive user data from a wide variety of sources, including hidden tracking in the background of mobile apps. While techniques vary and are often hidden behind layers of non-disclosure agreements (or NDAs), the resulting raw data they collect and process is based on sensitive, individual location traces.
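To make that intuition concrete, here is a minimal Python sketch of the kind of uniqueness test behind that finding—not the study’s actual code: sample a couple of coarse time-and-place points from one device’s trace and count how many devices in the dataset contain them. The data layout and names (device IDs mapped to sets of hour/location-cell pairs) are illustrative assumptions.

```python
# Minimal sketch of a trace-uniqueness test, loosely modeled on the idea
# behind the 2013 study. The data layout and field names are hypothetical.
import random

def unique_fraction(traces, points_per_probe=2, trials=1000):
    """traces: dict mapping device_id -> set of (hour, location_cell) tuples.

    Returns the fraction of random probes whose points match exactly one
    device, i.e. the probed points alone single that device out.
    """
    eligible = [d for d, pts in traces.items() if len(pts) >= points_per_probe]
    unique = 0
    for _ in range(trials):
        device = random.choice(eligible)
        probe = random.sample(sorted(traces[device]), points_per_probe)
        # How many devices contain every probed (hour, location_cell) point?
        matches = [d for d, pts in traces.items() if all(p in pts for p in probe)]
        if len(matches) == 1:
            unique += 1
    return unique / trials
```

Run against a real trace database, a test like this is what demonstrates that even a couple of points act as a fingerprint for most people.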
Aggregating location data can sometimes preserve individual privacy, given appropriate parameters that take into account the number of people represented in the data set and its granularity. But no privacy-preserving aggregation protocols can justify the initial collection of location data from people without their voluntary, meaningful opt-in consent, especially when that location data is then exploited for profit and PR spin.
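For illustration only, the sketch below shows one common form such aggregation takes: report per-cell device counts only where enough distinct devices are present, and suppress the rest. The threshold value and input format are assumptions for the example, not any broker’s actual protocol—and, as noted above, no threshold fixes non-consensual collection.

```python
# Sketch of threshold-based aggregation: publish per-cell device counts only
# for cells seen by at least k_min distinct devices; suppress the rest.
# The threshold value and input format are illustrative assumptions.
from collections import defaultdict

def aggregate_counts(observations, k_min=30):
    """observations: iterable of (location_cell, device_id) pairs."""
    devices_per_cell = defaultdict(set)
    for cell, device_id in observations:
        devices_per_cell[cell].add(device_id)
    # Only cells meeting the minimum-count threshold are reported at all.
    return {cell: len(devices)
            for cell, devices in devices_per_cell.items()
            if len(devices) >= k_min}
```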
Data brokers’ products are notoriously easy to re-identify, especially when combined with other data sets. And combining datasets is exactly what some academic studies are doing. Published studies have combined data broker location datasets with Census data, real-time Google Maps traffic estimates, and local household surveys and state Department of Transportation data. While researchers appear to be simply building the most reliable and comprehensive possible datasets for their work, this kind of merging is also the first step someone would take if they wanted to re-identify the data.
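To see why, consider a hedged sketch of re-identification by linkage, with entirely hypothetical column names and an assumed auxiliary dataset: the same merge a researcher runs to enrich broker data can also narrow an “anonymous” device ID down to a single person.

```python
# Sketch of re-identification by linkage. Column names and the auxiliary
# dataset (e.g. a survey or public record keyed by place and date) are
# hypothetical; this is not any published study's method.
import pandas as pd

def candidate_identities(dwell_points: pd.DataFrame, auxiliary: pd.DataFrame) -> pd.DataFrame:
    """dwell_points: columns device_id, location_cell, date (broker feed).
    auxiliary: columns person, location_cell, date (outside data source).
    """
    linked = dwell_points.merge(auxiliary, on=["location_cell", "date"])
    # A device that co-occurs with exactly one person across all linked
    # rows is effectively re-identified.
    persons_per_device = linked.groupby("device_id")["person"].nunique()
    singled_out = persons_per_device[persons_per_device == 1].index
    return linked[linked["device_id"].isin(singled_out)]
```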
NDAs, NDAs, NDAs
Data brokers are not good sources of information about data brokers, and researchers should be suspicious of any claims they make about the data they provide. As Cracked Labs researcher Wolfie Christl puts it, what data brokers have to offer is “potentially flawed, biased, untrustworthy, or even fraudulent.”
Some researchers incorrectly describe the data they receive from data brokers. For example, one paper describes SafeGraph data as “anonymized human mobility data” or “foot traffic data from opt-in smartphone GPS tracking.” Another describes Spectus as providing “anonymous, privacy-compliant location data” with an “ironclad privacy framework.” Again, this location data is not opt-in, not anonymized, and not privacy-compliant.
Other researchers make internally contradictory claims about location data. One Nature paper characterizes Veraset’s location data as achieving the impossible feat of being both “fine-grained” and “anonymous.” This paper further states it used such specific data points as “anonymized device IDs” and “the timestamps, and precise geographical coordinates of dwelling points” where a device spends more than 5 minutes. Such fine-grained data cannot be anonymous.
A Veraset Data Access Agreement obtained by EFF includes a Publicity Clause giving Veraset control over how its partners may disclose Veraset’s involvement in publications, including the prerogative to approve language or to remain anonymous as the data source. While the agreement we’ve seen was with a municipal government, its suggested language appears in multiple academic publications, indicating that similar agreements may be in place with academics as well.
A similar pattern appears in papers using X-Mode data: some use nearly verbatim language to describe the company. They even claim its NDA is a good thing for privacy and security, stating: “All researchers processed and analyzed the data under a non-disclosure agreement and were obligated to not share data further and not to attempt to re-identify data.” But those same NDAs prevent academics, journalists, and others in civil society from understanding data brokers’ business practices, or identifying the web of data aggregators, ad tech exchanges, and mobile apps that their data stores are built on.
All of this should be a red flag for Institutional Review Boards, which review proposed human subjects research and need visibility into whether and how data brokers and their partners actually obtain consent from users. Likewise, academics themselves need to be able to confirm the integrity and provenance of the data on which their work relies.
From Insurance Against Bad Press to Accountable Transparency
Data sharing programs with academics are only the tip of the iceberg. To paper over the dangerous role they play in the online data ecosystem, data brokers forge relationships not only with academic institutions and researchers, but also with government authorities, journalists and reporters, and non-profit organizations.
The question of how to balance data transparency with user privacy is not a new one, and it can’t be left to the Verasets and X-Modes of the world to answer. Academic data sharing programs will continue to function as disingenuous PR operations until companies are subjected to data privacy and transparency requirements. While SafeGraph claims its data could pave the way for impactful research on abortion access, the fact remains that the very same data puts actual abortion seekers, providers, and advocates in danger, especially in the wake of Dobbs. The sensitive location data these brokers deal in should only be collected and used with specific, informed consent, and subjects must have the right to withdraw that consent at any time. No such consent currently exists.
We need comprehensive federal consumer data privacy legislation to enforce these standards, with a private right of action to empower ordinary people to bring their own lawsuits against data brokers who violate their privacy rights. Moreover, we must pull back the NDAs to allow research investigating these data brokers themselves: their business practices, their partners, how their data can be abused, and how to protect the people whom data brokers are putting in harm’s way.