Special thanks to former legal intern Hinako Sugiyama, who was a lead co-author of this post.
Technology users around the world are increasingly concerned, and rightly so, about protecting their data. But many are unaware of exactly how their data is being collected and would be shocked to learn of the scope and implications of mass consumer data collection by technology companies. For example, many vendors use tracking technologies, including cookies, to build expansive profiles of user behavior over time and across apps and sites. A cookie is a small piece of text stored in your browser that lets websites recognize your browser and see your browsing activity or IP address, but not your name or address. Such data can be used to infer, predict, or evaluate information about a user or group. User profiles may be inaccurate, unfair, or discriminatory, but can still be used to inform life-altering decisions about the people they describe.
A recent data privacy scandal in Japan involving Rikunabi, a major job-seeking platform that calculated and sold to companies algorithmic scores predicting how likely individual job applicants were to decline a job offer, has underscored how users’ behavioral data can be used against their best interests. Most importantly, the scandal showcases how companies design workarounds or “data-laundry” schemes to circumvent data protection obligations under Japan's data protection law, the Act on the Protection of Personal Information (APPI). This case also highlights the dangers of badly written data protection laws and their loopholes. Japan's Parliament adopted amendments to the APPI, expected to be implemented by early 2022, intended to close some of these loopholes, but the changes still fall short.
The Rikunabi Scandal
Rikunabi is operated by Recruit Career (the company's name at the time of the scandal; it is now Recruit Co., Ltd.), a subsidiary of the media conglomerate Recruit Group, which also owns Indeed and Glassdoor. Rikunabi allows job-seekers to search for job opportunities and mostly caters to college students and others just beginning their careers. It hosts job listings for thousands of companies. Like many Internet platforms, Rikunabi used cookies to collect data about how its users search, browse, and interact with its job listings. Between March 2018 and February 2019, using Rikunabi’s data and without users’ consent, Recruit Career calculated and sold to companies algorithmic scores that predicted how likely an individual job applicant was to decline a job offer or withdraw their application.
Thirty-five companies, including Toyota Motor Corporation, Mitsubishi Electric Corporation, and other Japanese corporate giants, purchased the scores. In response to a public outcry, Recruit Career tried to excuse itself by saying that the companies who purchased the job-declining scores had agreed not to use them for the selection of candidates. The company claimed the scores were intended only to help clients communicate better with their candidates, but there was no guarantee that this was how they would be used. Because of Japan’s dominant lifetime employment system, students feared such scores could limit their job opportunities and career choices, potentially affecting their whole professional life.
APPI: Japanese Data Protection Law
A loophole in the APPI is key to understanding the Rikunabi scheme. Ironically, Japan, the world’s third-biggest economic power and one of the most technologically advanced, is the first country whose data protection law was recognized as offering a level of protection equivalent to that of European Union (EU) law. However, the APPI lags considerably behind EU law on the regulation of cookies and their use to identify people.
Under the stronger, stricter, and more detailed EU data protection regulations, cookies can constitute personal data. Identifiers don’t have to include a user’s legal name (meaning the identity found on a national ID card or driver’s license) to be considered personal data under EU law. If entities processing personal data can indirectly identify you, based on multiple pieces of data, such as cookies and other identifiers likely to distinguish you from others, that is considered processing personal data. This is what EU authorities refer to as “singling-out” to indirectly identify people: isolating some or all records which identify an individual, linking at least two records of the same individual to identify someone, or inferring identification by looking at certain characteristics and comparing them to other characteristics. The very definition of personal data under the EU’s General Data Protection Regulation (GDPR) refers to things that are "online identifiers." GDPR guidelines specifically mention that cookie identifiers may be used to create profiles of and identify people. If companies process personal data in a way that could tell one person apart from another, then this person is "identified or identifiable." And if the data is about a person, and used with the purpose of evaluating the individual, or is likely to have an impact on the person’s rights or interests, such data "relates to" the "identified or identifiable" person.
These are key elements of what is defined as personal data within EU regulation, and they are valuable for understanding this case. Why? Because EU regulation requires companies to request users’ prior consent before using any identifying cookies, except ones strictly necessary for things like remembering items in your shopping cart or information entered into forms. In contrast, the APPI uses very different criteria to judge whether cookies or similar machine-generated identifiers are personal data. APPI guidelines look at whether a company collecting, processing, and transferring cookies can readily collate them with other information, by a method used in the ordinary course of business, to find out the legal identity of an individual. So if a company can identify an individual only by asking another company for access to other data to collate with a cookie, the cookie is not considered personal data for that company. The company can thus freely collect, process, and transfer the cookie even when a recipient of the cookie can easily re-identify the person by linking it with another data set. Under this test, companies can indirectly identify people by means of singling out without running afoul of the APPI.
The Rikunabi Scheme: Data Laundering to Circumvent the Spirit of the Law
The strategy involved three players. The first two are Recruit Career and Recruit Communications. Recruit Career is the company that operates Rikunabi, the job-search website. Recruit Communications is a marketing and advertising company, which Recruit Career subcontracted to create and deliver algorithmic scores. The third player is the one purchasing the scores: Rikunabi’s clients such as Toyota Motor Corporation.
According to a disclosure by Recruit Career, the scheme operated as follows:
Recruit Career collected data about users who visited and used the Rikunabi site. This included their real names, email addresses, and other personal data, as well as their browsing activity on Rikunabi. For example, one user’s profile might contain information about which companies they searched for, which ones they looked at, and what industries they seemed most interested in. All of this information was linked to a Rikunabi cookie ID. For the creation of algorithmic scores, Recruit Career shared with Recruit Communications Rikunabi users’ browsing history and activity linked to their Rikunabi cookie IDs, omitting real names.
At the same time, client companies such as Toyota accepted job applications on their own website. Each client company collected applicants’ legal names and contact information, and also assigned each applicant a unique applicant ID. All of this information was linked to the companies’ Employer cookie IDs. For the scoring work, each client company instructed applicants to take a web survey, which was designed to allow Recruit Communications to directly collect their Employer cookie IDs and applicant IDs connected to them. In this way, Recruit Communications was able to collect applicants’ Rikunabi cookies and the cookies assigned to applicants by client companies.
Recruit Communications somehow linked these two sets of identifiers, possibly by using cookie syncing (a method that web trackers use to link cookies with one another and combine the data one company has about a user with data that other companies might have), so that it could associate applicants' Rikunabi browsing activity with their applicant IDs and single out an individual.
With the linked database, Recruit Communications put the data to work. It trained a machine learning model to look at a user’s Rikunabi browsing history and then predict whether that user would accept or reject a job offer from a particular company.
Recruit Communications then delivered those scores associated with applicant IDs back to client companies. Since each client had its own database linking its applicant IDs to real identities, client companies could easily associate the scores they received from Recruit Communications with the real names of job applicants. And job seekers who entrusted their data to Rikunabi? Without their knowledge or consent, the site’s operator and its sister company, in collaboration with Rikunabi’s clients, had created a system that may have cost them a job offer by inaccurately predicting what jobs or companies they were interested in.
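The data flow described in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every identifier and value is made up, the `cookie_sync` table stands in for whatever linking method was actually used (the disclosure does not say), and `toy_decline_score` is a placeholder heuristic, not the machine learning model Recruit Communications trained.

```python
# All identifiers and values below are invented for illustration only.

# Shared by the job site operator without names: site cookie ID -> browsing activity.
site_activity = {
    "rk-001": ["viewed:automaker_a", "viewed:automaker_b", "viewed:automaker_b"],
    "rk-002": ["viewed:automaker_a", "viewed:automaker_a"],
}

# Collected through the web survey: employer cookie ID -> applicant ID.
survey_records = {"emp-9a": "app-17", "emp-9b": "app-42"}

# ASSUMPTION: a cookie-sync table mapping employer cookie ID -> site cookie ID.
cookie_sync = {"emp-9a": "rk-001", "emp-9b": "rk-002"}

# Held only by the client company: applicant ID -> real name.
client_applicants = {"app-17": "Applicant A", "app-42": "Applicant B"}


def link_identifiers(survey, sync):
    """Map each applicant ID to a site cookie ID via the shared employer cookie."""
    return {
        applicant_id: sync[employer_cookie]
        for employer_cookie, applicant_id in survey.items()
        if employer_cookie in sync
    }


def toy_decline_score(events, employer):
    """Placeholder for a trained model: share of page views that went to other employers."""
    if not events:
        return 0.0
    other = sum(1 for event in events if event != f"viewed:{employer}")
    return other / len(events)


def scores_by_applicant(activity, links, employer):
    """Scores keyed by applicant ID only; no name appears at this stage."""
    return {
        applicant_id: toy_decline_score(activity.get(cookie_id, []), employer)
        for applicant_id, cookie_id in links.items()
    }


def reidentify(scores, applicants):
    """The client's final join: applicant ID -> real name."""
    return {applicants[a]: s for a, s in scores.items() if a in applicants}


links = link_identifiers(survey_records, cookie_sync)
pseudonymous_scores = scores_by_applicant(site_activity, links, "automaker_a")
named_scores = reidentify(pseudonymous_scores, client_applicants)
print(pseudonymous_scores)  # keyed by applicant ID
print(named_scores)         # keyed by real name
```

The point of the sketch is that no single table held by the scoring party contains a name, yet one dictionary lookup on the client's side turns every score into a statement about a named person.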
Why Do It Like This?
The APPI prohibits businesses from sharing a user’s personal data without prior consent. So, if Recruit Career delivered scores linked to applicants’ names, it would be required to get users’ consent to process their information in that way.
The APPI doesn’t regard cookies or similar machine-generated identifiers as personal data if a company itself cannot readily collate them with other data sets to identify a person. Because Recruit Communications was provided only with data disconnected from names and other personal identifiers, it was, by design, unable to readily collate that data with other information to identify individuals. Thus, under the APPI, Recruit Communications was not collecting, processing, or providing any personal data and had no need to get user consent to calculate and deliver algorithmic scores to client companies.
This data laundering scheme could have been created to ensure that the whole program was technically legal, even without users’ consent. But because Recruit Career knew that client companies could easily associate the scores linked to each applicant ID with applicants’ real names, the Japanese data protection authority, the Personal Information Protection Commission, found that it had engaged in “very inappropriate services, which circumvented the spirit of the law,” and ordered the company to improve privacy protections.
The 2020 APPI Amendment Closed Some Loopholes, But Others Remain
After the scandal, the APPI was amended in June 2020. When the amended law goes into effect by early 2022, it will require companies transferring a cookie or similar machine-generated identifier to confirm beforehand whether the recipient of the data can identify an individual by combining such data with other information that the recipient has. When that is the case, the new APPI requires companies transferring such data to ensure that the recipient has obtained users' prior consent for the collection of personal data. Rikunabi’s scheme would violate the 2020 amendment unless Recruit Communications, knowing full well that clients can combine the data it provides with data they already have to identify individuals, confirmed with clients before transferring algorithmic scores that they had obtained users’ prior consent for collecting their private information.
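The difference between the original holder-only test and the amended, recipient-aware check can be illustrated with a small model. This is a deliberately simplified sketch and not a statement of the law: the `Party` class, the identifier labels, and both function names are hypothetical, and "can readily collate" is reduced to whether a party's own records connect an identifier to a real name.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    name: str
    # Pairs of identifier types this party can join within its own records
    # (hypothetical labels, chosen to mirror the scheme described above).
    links: set = field(default_factory=set)


def can_reach_name(party, start):
    """True if the party can get from `start` to a real name using only its own links."""
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for a, b in party.links:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return "real_name" in seen


def needs_consent_check_on_transfer(recipient, identifier):
    """Simplified reading of the 2020 amendment: look at what the recipient can do."""
    return can_reach_name(recipient, identifier)


scorer = Party(
    "scoring subcontractor",
    {("site_cookie", "employer_cookie"), ("employer_cookie", "applicant_id")},
)
client = Party(
    "client company",
    {("applicant_id", "real_name"), ("employer_cookie", "applicant_id")},
)

# Holder-only view: the scorer cannot reach a name from what it holds.
print(can_reach_name(scorer, "site_cookie"))  # False
# Recipient-aware view: the client can turn an applicant ID into a name.
print(needs_consent_check_on_transfer(client, "applicant_id"))  # True
```

Under the holder-only view the scorer's data never counts as personal data, which is the gap the scheme relied on; asking the same question about the recipient gives the opposite answer.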
But even after the 2020 amendment, the APPI does not classify a cookie as personal data when combined indirectly with the dossiers of behavioral data often associated with them. This is a mistake. Cookies and similar machine-generated identifiers (like mobile ad IDs) are the linchpins that enable widespread online tracking and profiling. Cookies are used to link behavior from different websites to a single user, allowing trackers to connect huge swaths of a person’s life into a single profile. Just because a cookie isn’t directly linked to a person’s real identity doesn’t make the profile any less sensitive. And thanks to the data broker industry, cookies often can be linked to real identities with relative ease. A slew of “identity resolution” service providers sell trackers the ability to link pseudonymous cookie IDs to mobile phones, email addresses, or real names.