Republicans in Congress recently voted to repeal the FCC’s broadband privacy rules. As a result, your Internet provider may be able to sell sensitive information like your browsing history or app usage to advertisers, insurance companies, and more, all without your consent. In response, Internet users have been asking what they can do to protect their own data from this creepy, non-consensual tracking by Internet providers, for example by directing their Internet traffic through a VPN or Tor. One idea that has recently gained a lot of traction among privacy-conscious users is data pollution tools: software that fills your browsing history with visits to random websites in order to add “noise” to the browsing data that your Internet provider is collecting.
We’ve seen this idea suggested several times, and we’ve received multiple questions about how effective it would be and whether or not it protects your privacy, so we wanted to provide our thoughts. Before we begin, however, we want to note that several seasoned security professionals have already weighed in on the effectiveness and risks involved in using these tools.
While we want to be optimistic and encourage more user-friendly technology, it’s important to evaluate new tools with caution, especially when the stakes are high. Additionally, one of the goals of this post is to dispel misconceptions about which problems these tools actually solve.
Limitations of ISP Data Pollution Tools
After reviewing these sorts of tools, we’ve come to the conclusion that in their current form, these tools are not privacy-enhancing technologies, meaning that they don’t actually help protect users’ sensitive information.
To see why, let’s imagine two possible scenarios that could occur if your browsing history were somehow leaked.
First, imagine the tool visited a website you don’t want to be associated with. Many data pollution tools try to prevent this by blacklisting certain potentially inappropriate words or websites (or only searching on whitelisted websites) and relying on Google’s SafeSearch feature. However, even with these protections in place, the algorithm could still visit a website that might not be embarrassing for everyone, but could be embarrassing for you (say, a visit to an employment website when you haven’t told your employer you’re thinking of leaving). In this case, it might be difficult to prove it was the automated tool and not you who generated that traffic.
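To make this limitation concrete, here is a minimal Python sketch of the kind of word-and-site filtering described above. It is not taken from any real tool: the function name `is_allowed`, the blocked words, and the `.example` hosts are all hypothetical. It only illustrates that a generic filter has no way of knowing which sites are sensitive for one particular person.

```python
from urllib.parse import urlparse

# Hypothetical filter lists, invented for illustration only.
BLOCKED_WORDS = {"casino", "adult"}
BLOCKED_HOSTS = {"bad.example"}


def is_allowed(url):
    """Return True if a generic word/site filter would let the tool visit url."""
    text = url.lower()
    host = urlparse(text).hostname
    if host in BLOCKED_HOSTS:
        return False
    # Reject any URL containing a blocked word anywhere in its text.
    return not any(word in text for word in BLOCKED_WORDS)


candidates = [
    "https://casino.example/",                    # blocked by word
    "https://bad.example/page",                   # blocked by host
    "https://jobs.example/search?q=new+position", # passes the filter
]
visited = [url for url in candidates if is_allowed(url)]
print(visited)
```

In this sketch the job-search URL passes because nothing about it is objectionable in general, even though it might be exactly the visit you would not want attributed to you.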
Second, sensitive data is still sensitive even when surrounded by noise. Imagine that your leaked browsing history showed a pattern of visits to websites about a certain health condition. It would be very hard to claim that it was the automated tool that generated that sort of traffic when it was in fact you.
It’s reasonable to assume that whoever is analyzing this data will put some effort into filtering out noise when looking for trends. After all, this is standard practice across the industry when analyzing large data sets. This doesn’t necessarily mean that the data analysis will always beat the noise generation, but it’s still an important factor to consider. Moreover, layering noise onto a prominent pattern will not make that pattern any less prominent. Additionally, your Internet provider may already have years of data about your browsing habits from which it can extrapolate to help with its noise-filtering efforts.
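As a rough illustration of how a prominent pattern can survive added noise, the toy simulation below mixes 40 real visits on one topic into 2,000 randomly generated visits spread across 200 unrelated topics, then flags any topic visited far more often than the typical one. Everything here is an assumption made for illustration: the topic labels, the visit counts, the uniform noise model, and the simple median-based threshold. Real traffic analysis and real pollution tools are more sophisticated, so this sketch shows why a pattern can remain visible, not that it always will.

```python
import random
from collections import Counter


def simulate_history(real_topic, real_visits, noise_topics, noise_visits, seed=0):
    """Mix repeated real visits with uniformly random noise visits."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    history = [real_topic] * real_visits
    history += [rng.choice(noise_topics) for _ in range(noise_visits)]
    rng.shuffle(history)
    return history


def standout_topics(history, factor=3.0):
    """Flag topics visited more than `factor` times the median topic count."""
    counts = Counter(history)
    typical = sorted(counts.values())[len(counts) // 2]  # median count
    return [topic for topic, n in counts.items() if n > factor * typical]


# Hypothetical labels: 200 unrelated noise topics and one real interest.
NOISE_TOPICS = [f"topic-{i}" for i in range(200)]
history = simulate_history("health-condition", 40, NOISE_TOPICS, 2000)
flagged = standout_topics(history)
print(flagged)
```

Even though about 98% of the simulated visits are noise, each noise topic receives only around ten visits, so the topic with 40 real visits still clears the threshold. Noise that is spread thinly across many topics raises the baseline without hiding a concentrated pattern.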
Even if these specific problems were solved, we would still be reluctant to say that data pollution software could successfully protect your privacy. That’s because this kind of traffic analysis is an active area of research, and there aren’t yet any well-tested, large-scale models to show that these techniques work.
In other words, there are currently too many limitations and too many unknowns to be able to confirm that data pollution is an effective strategy for protecting one’s privacy. We’d love to eventually be proven wrong, but for now, we simply cannot recommend these tools as an effective method for protecting your privacy.
Changing Internet Provider Behavior is a Worthy Goal, but Your Energy is Better Spent Calling Congress
Data pollution tools also aren’t likely to succeed at their other primary goal: convincing Internet providers to stop mining our data to sell targeted ads. The theory here is that if enough people used these tools, then the vast majority of browsing data Internet providers collected would be inaccurate. Inaccurate data is worthless for targeting ads, so there would no longer be any monetary incentive for Internet providers to try to sell targeted ads, and thus no incentive to keep collecting browsing data in the first place.
Unfortunately, a huge fraction of customers would have to be using data pollution tools for them to have an impact on major Internet providers’ bottom lines. And while it’s wonderful to imagine the majority of Internet users up in arms and installing one of these projects, it would be at least as useful for all these users to call their lawmakers directly and convince them to pass privacy-protecting legislation instead. In fact, it would probably take far fewer people to get Congress to change its mind than it would to affect a large Internet provider’s bottom line.
Culture Jamming for the Web
With all of that said, these tools could potentially be effective at one thing: confusing your Internet provider’s ad-targeting algorithms and making the ads they show you less relevant. If this sort of culture jamming appeals to you, then these tools could help you accomplish that. Just keep in mind that you’ll have to rely on other techniques to protect your privacy from your Internet provider, and that to really achieve the sort of change we need, we also need to take the time to talk to our lawmakers and make our voices heard directly. Only through a combination of activism, technology, and legislation will we truly be able to protect our privacy online.