Data mining our online footprint: valuable insight for them, privacy anxiety for us

Written by Alice Marwick. Posted in Features

Last week, Facebook announced its new online targeted ad platform called Atlas. If you browse the web while logged into Facebook - and chances are that you do, considering the number of websites that use Facebook for logins and comments - the ads you encounter around the web will be carefully chosen based on information provided to Facebook. While at its launch Atlas will only serve ads targeted by age and gender, Facebook holds a wealth of personal information on its users.

This includes (but is by no means limited to) your appearance, favorite places, closest friends, schools attended, favorite movies and books, political beliefs, sports allegiances, games played or music listened to this week, and brands and celebrities 'liked' on Facebook, all of which can be used to precisely target advertising. Instead of blanketing the web with generic ad banners promising miracle weight loss treatments or work-at-home opportunities, advertisers can, in theory, choose to deliver their advertising only to 25-40 year old suburban men with children, or 16-24 year old lesbian sports fans.

This level of specificity is made possible by a number of factors. First, social media encourages people to provide enormous amounts of personal information to third parties. While some see this provision as reckless, it’s hardly surprising that people discuss the same things online as they do face to face, like television shows, recent purchases or current events. The difference is that when they discuss such subjects at a dinner party or the water-cooler, it isn’t captured, aggregated, combined with other information and sold to others. That this is even feasible is due to the second factor, the popularity of 'big data' in marketing and advertising. The marketing field has been a pioneer in big data, or the analysis of large sets of information to reveal sometimes surprising patterns and connections.

What’s new about Facebook’s Atlas platform is that it adds social media data to what is already an enormous pool of personal information. Data brokers are companies which collect personal information, sort and aggregate it and then sell or share it with others. The largest company in the field of database marketing, Acxiom, claims to have individual files on 700 million people. These files are composed of information collected from a large variety of sources, including public records, surveys and warranty cards, magazine and catalog subscriber lists, consumer loyalty programs (cards given to shoppers that provide discounts in exchange for tracking purchasing information), and online and offline shopping.

Data brokers use this information for micro-targeting - one data broker sells a list of men suffering from erectile dysfunction, for instance - but also to sort people into categories, or 'segments'. Experian’s Mosaic product sorts Americans into 71 different segments in 19 categories. A recent US Senate Committee report called out brokers for targeting products to 'financially vulnerable' segments, the names of which include 'X-Tra Needy', 'Hard Times' and 'Very Spartan'.
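To make the idea of 'segments' concrete, here is a minimal sketch in Python of rule-based segmentation. The record fields, thresholds and segment labels are all invented for illustration; they are not taken from Experian's Mosaic or any other broker's methodology, which is proprietary and far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class ConsumerRecord:
    # All fields are hypothetical; real broker files hold many more attributes.
    name: str
    age: int
    household_income: int  # annual, in US dollars
    has_children: bool


def assign_segment(record: ConsumerRecord) -> str:
    """Return a made-up segment label by applying a few simple rules in order."""
    if record.household_income < 20_000:
        return "low-income"
    if record.has_children and 25 <= record.age <= 40:
        return "young-family"
    if record.age >= 65:
        return "retiree"
    return "general"


records = [
    ConsumerRecord("A", 34, 55_000, True),
    ConsumerRecord("B", 70, 30_000, False),
    ConsumerRecord("C", 29, 15_000, False),
]

# Group the names of the records by the segment each one falls into.
segments = {}
for record in records:
    segments.setdefault(assign_segment(record), []).append(record.name)

print(segments)
```

Even in this toy version, the order of the rules matters: a record that matches the income rule is labelled by it before any other attribute is considered, which is one way a 'financially vulnerable' label could come to dominate a person's profile.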

Data brokers already collect information from the internet. Visiting most websites places a small text file called a digital cookie on the user’s computer. Some of these cookies are called 'first-party' cookies, which are necessary to store usernames, passwords and preferences. For example, Yahoo! uses first-party cookies to log users in every time they visit. Third-party cookies, on the other hand, are set by a domain other than the site being visited, such as an advertising network, and so can follow the user from site to site.

If Jane visits a site with an ad banner, the banner’s advertising network will place a cookie on her hard drive. From then on, every time Jane visits a site that uses the same advertising network, this information will be added to the cookie. Thus, the third-party cookie knows not only which sites Jane visits, but in what order, what ads she clicks on and what products she’s interested in. Data brokers engage in 'onboarding', which matches Jane’s offline profile with her online profile, adds information from her offline file to the third-party cookie and enables clients to target Jane both online and offline.
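Jane's case can be sketched as a small Python simulation. Everything here is an assumption made for illustration: the `AdNetwork` class, the `.example` site names, the contents of the offline file, and the use of an email address as the key that links the two profiles. It is not a description of how any particular ad network or data broker works.

```python
import uuid
from collections import defaultdict


class AdNetwork:
    """Toy ad network that recognises a browser by a cookie ID."""

    def __init__(self):
        # cookie ID -> ordered list of sites on which the banner was loaded
        self.history = defaultdict(list)

    def serve_banner(self, site, cookie_id=None):
        """Record the visit and return the cookie ID the browser should keep."""
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex  # first visit: issue a new ID
        self.history[cookie_id].append(site)
        return cookie_id


network = AdNetwork()

# Jane browses three unrelated sites that all carry the same network's banner.
jane_cookie = None
for site in ["news.example", "shoes.example", "travel.example"]:
    jane_cookie = network.serve_banner(site, jane_cookie)

# Hypothetical offline file held by a data broker, keyed by email address.
offline_files = {"jane@example.com": {"loyalty_card": True, "postcode_area": "EC1Y"}}

# Assumption: a partner site where Jane logged in has linked her email to the cookie.
cookie_to_email = {jane_cookie: "jane@example.com"}


def onboard(cookie_id):
    """Merge the online browsing history with the matching offline file, if any."""
    profile = {"sites_visited": list(network.history[cookie_id])}
    email = cookie_to_email.get(cookie_id)
    profile.update(offline_files.get(email, {}))
    return profile


jane_profile = onboard(jane_cookie)
print(jane_profile)
```

In this sketch the cookie itself holds only an identifier; the browsing history accumulates on the network's side, keyed by that identifier. The join in `onboard` is the step that turns an anonymous browsing trail into a record tied to a named offline file.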

Information provided in one context can, when combined with other types of information, reveal more than the individual may have intended. This is the explicit promise of big data and why it is so enticing for marketers and advertisers. While there is nothing particularly dastardly about a grocery store tracking one's purchases to deliver coupons for favourite products, there are troubling implications when local police use this purchase information to identify possible criminal suspects (this happened in a 2004 arson case - the suspect was innocent, but had purchased fire starters from a Safeway supermarket).

Data brokers sell information to governments and educational institutions as well as credit card issuers, retail banks and insurance companies. While data brokers admit to working with government agencies, they are tight-lipped about the extent. In the past, they have provided the Justice Department with travel records and worked with the FBI to identify suspected terrorists. Data brokers have also mistakenly sold information to criminals, notably a set of Vietnamese identity theft rings. The latter deal in what they call 'fullz', dossiers of information about an individual that include social security number, name and address, and often credit card numbers.

Facebook already works with four large data brokers, including Acxiom. Companies can use data provided by the brokers to choose potential market segments, which Facebook matches with their online equivalents. But while Facebook claimed in the past that it did not provide personal information to these brokers, the Atlas deal makes this claim quite suspicious. The Senate Committee’s report noted that although they are competitors, information flows freely between the top data brokers. Moreover, Atlas is designed to work seamlessly on mobile devices, many of which boast Facebook integration throughout the user interface. Facebook’s ability to use its own information to target ads will go far beyond Facebook.com.

Recently, the US government has begun to investigate the practices of data brokers, citing the lack of transparency and oversight. Privacy organisations like the Electronic Privacy Information Center have voiced concerns about third-party cookies, data brokers and Facebook. The combination of all three is likely to set off a firestorm of concern. While modern browsers are integrating so-called 'Do Not Track' functionality, which allows people to opt out of marketing and advertising cookies, Atlas only gives users the ability to avoid seeing targeted ads, not to opt out of data collection.

While much of our current concern about surveillance involves government - the so-called 'Five Eyes' and the revelations of Edward Snowden - we should not ignore the role of the private sector. Increasingly, people are seeking ways to socialise online that do not require the use of their real name or personally identifiable information. Apps like Snapchat, Whisper, Yik Yak and Secret are popular among teenagers for their ephemerality and anonymity. While Facebook is an integral part of many people’s lives, making 'opting out' difficult, it is notable that new competitor Ello and old stalwart Microsoft are both positioning customer privacy as a competitive advantage. Socialising online is really about a desire for human connection - the price should not be comprehensive surveillance.

Copyright 2019 Royal Statistical Society. All Rights Reserved.
12 Errol Street, London, EC1Y 8LX. UK registered charity in England and Wales. No.306096