"Protecting Identities of Panelists in Market Research"

According to a study conducted by researchers at the Cornell SC Johnson College of Business, it is highly likely that the identity and other sensitive information of a survey participant can be traced back to the individual. When organizations release or share data, they comply with privacy regulations by suppressing or anonymizing personally identifiable information (PII), according to Sachin Gupta, Ph.D. '93, the Henrietta Johnson Louis Professor of Management at SC Johnson College. Having gone through these suppression and anonymization processes, organizations believe they have protected the privacy of the individuals whose data they share. According to Gupta, this may not be the case, because data can always be linked with other data.
Gupta and colleagues argue in a new paper, "Reidentification Risk in Panel Data: Protecting for k-Anonymity," that nearly all market research panel participants are at risk of being de-anonymized. Personal information such as a person's name, date of birth, email address, and other identifiers is floating around in cyberspace, ready for the taking by a highly motivated individual or company.
Gupta and colleagues cited a 2008 paper by a pair of researchers from the University of Texas at Austin, who developed a de-anonymization algorithm called Scoreboard-RH. The algorithm was able to identify up to 99 percent of Netflix subscribers by combining anonymized data from a 2006 competition, which was aimed at improving Netflix's recommendation service, with publicly available information on the Internet Movie Database.
That study, like Gupta's, is based on quasi-identifiers (QIDs): attributes that appear in both anonymized data sets and publicly available data sets and can therefore be used to link them. The conventional measure of disclosure risk, known as unicity, is the proportion of individuals in a given data set whose combination of QID values is unique.
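The definition of unicity above can be made concrete with a small calculation. The following is a minimal sketch, not the method used in the paper: the function name `unicity`, the toy records, and the choice of ZIP code, birth year, and gender as QIDs are all illustrative assumptions.

```python
from collections import Counter


def unicity(records, qid_columns):
    """Share of records whose combination of QID values appears exactly once."""
    keys = [tuple(r[c] for c in qid_columns) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys) if keys else 0.0


# Hypothetical toy data: one record per individual.
people = [
    {"zip": "14850", "birth_year": 1980, "gender": "F"},
    {"zip": "14850", "birth_year": 1980, "gender": "F"},
    {"zip": "14850", "birth_year": 1975, "gender": "M"},
    {"zip": "10001", "birth_year": 1990, "gender": "F"},
]

# Two of the four individuals have a QID combination nobody else shares.
print(unicity(people, ["zip", "birth_year", "gender"]))  # 0.5
```

With fewer QIDs the value can only stay the same or fall: using ZIP code alone, only the single resident of 10001 is unique, so unicity drops to 0.25.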
K-anonymity is a popular data privacy model that protects against disclosure risk by reducing the uniqueness of QIDs: a data set is k-anonymous when every combination of QID values in it is shared by at least k individuals. Gupta and his colleagues have created "sno-unicity," or snowballing unicity, a measure of worst-case reidentification risk that iteratively collects individuals who can be uniquely reidentified by at least one of their multiple records. This article continues to discuss findings from the study on the reidentification of individuals in panel data. 
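To illustrate the two ideas in this paragraph, here is a rough sketch of a k-anonymity check and of a snowballing unicity calculation on panel data, where each individual contributes several records. It is an interpretation, not the authors' algorithm: the summary only says that sno-unicity iteratively collects individuals who are uniquely identifiable through at least one record. The assumption made here is that once an individual is reidentified, all of their records are removed before the next pass, which can expose further individuals. All function names, column names, and data are hypothetical.

```python
from collections import defaultdict


def qid_groups(records, id_column, qid_columns):
    """Map each QID combination to the set of individuals who have it."""
    groups = defaultdict(set)
    for r in records:
        groups[tuple(r[c] for c in qid_columns)].add(r[id_column])
    return groups


def is_k_anonymous(records, id_column, qid_columns, k):
    """True if every QID combination is shared by at least k individuals."""
    groups = qid_groups(records, id_column, qid_columns)
    return all(len(ids) >= k for ids in groups.values())


def snowball_unicity(records, id_column, qid_columns):
    """Share of individuals reidentified after repeated passes (assumed procedure)."""
    everyone = {r[id_column] for r in records}
    identified = set()
    remaining = list(records)
    while True:
        groups = qid_groups(remaining, id_column, qid_columns)
        # Individuals who are the only remaining holder of some QID combination.
        newly = {next(iter(ids)) for ids in groups.values() if len(ids) == 1}
        if not newly:
            break
        identified |= newly
        # Assumption: a reidentified individual's records leave the pool.
        remaining = [r for r in remaining if r[id_column] not in identified]
    return len(identified) / len(everyone) if everyone else 0.0


# Hypothetical purchase panel: several records per panelist, one QID ("item").
panel = [
    {"panelist": "p1", "item": "milk"},
    {"panelist": "p1", "item": "tea"},
    {"panelist": "p2", "item": "milk"},
    {"panelist": "p2", "item": "eggs"},
    {"panelist": "p3", "item": "eggs"},
    {"panelist": "p4", "item": "coffee"},
    {"panelist": "p5", "item": "coffee"},
]

print(is_k_anonymous(panel, "panelist", ["item"], 2))  # False: only p1 bought tea
print(snowball_unicity(panel, "panelist", ["item"]))   # 0.6
```

In this toy panel only p1 is unique on the first pass (through the tea purchase). Removing p1 leaves p2 as the only milk buyer, and removing p2 leaves p3 as the only egg buyer, so three of five panelists end up exposed, while p4 and p5 remain indistinguishable from each other.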

Cornell Chronicle reports "Protecting Identities of Panelists in Market Research"
