OkCupid Study Reveals the Perils of Big-Data Science

Tuesday, May 17th, 2016

A recent OkCupid study reveals the ethical perils of Big Data:

On May 8, a group of Danish researchers publicly released a dataset of nearly 70,000 users of the online dating site OkCupid, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site.

When asked whether the researchers attempted to anonymize the dataset, Aarhus University graduate student Emil O. W. Kirkegaard, who was lead on the work, replied bluntly: “No. Data is already public.” This sentiment is repeated in the accompanying draft paper, “The OKCupid dataset: A very large public dataset of dating site users,” posted to the online peer-review forums of Open Differential Psychology, an open-access online journal also run by Kirkegaard.

Comments

  1. Slovenian Guest says:

    Update: The Open Science Framework removed the OkCupid data posting after OkCupid filed a Digital Millennium Copyright Act (DMCA) complaint on May 13.

    Denmark has strict privacy laws; you are not even allowed to enumerate products because doing so could identify a customer… So they got off easy with just a DMCA complaint. The “it was already stolen” defense would certainly not hold water in a court of law, not a chance.
