Policy Brief Big Data

Big Data: threat or opportunity?

COFACE Families Europe launched a discussion on the social impact of digitalisation at its Berlin conference at the end of 2016, looking at six dimensions: smart work, digital literacy, the digital economy, the potential of technology in social/health service provision, connected children and safety online, and the threats and opportunities of Big Data. As a result of the conference, we produced a series of short briefs summarising some of the emerging trends and challenges. This brief focuses on Big Data.


Big Data is understood as extremely large data sets that may be analysed computationally to reveal patterns, trends and associations, especially relating to human behaviour and interactions. It is taking on an ever-growing role in our lives as more and more of the products and services we buy or use rely on it. The reasons for this dependence vary: some business models are based on targeted advertising, while in other cases core functionalities rely on Big Data (as with self-driving cars, traffic prediction services or health apps). While examples of how Big Data can benefit society abound, we should still consider regulating its fields of application to ensure that it stays within ethically acceptable bounds.

The European institutions have agreed on a number of data protection rules in the General Data Protection Regulation (GDPR), which shall apply from 25 May 2018 in all EU Member States. Other regulations still under discussion will also have an impact on Big Data. For instance, the Digital Contracts Directive proposal considers that data can be a form of counter-performance, and that a consumer who offers data in exchange for a service is therefore protected in the same manner as a consumer paying with money.

A value greater than the sum of its parts

Big Data is considered so valuable and revolutionary because of the patterns and correlations it can identify across billions of data sets. This is why Facebook purchased WhatsApp, and Microsoft purchased LinkedIn, with a view to merging their data. By stitching together data about users, one can learn a vast number of things about them, many of them unsuspected.
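
To make the idea of stitching data together concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two record sets, the field names and the choice of a phone number as the shared key are assumptions made for illustration, not a description of how any company actually merges its data.

```python
# Hypothetical records held by two separate services about the same people.
messaging_profiles = {
    "+32470000001": {"contacts": 212, "active_hours": "23:00-02:00"},
    "+32470000002": {"contacts": 35, "active_hours": "08:00-18:00"},
}
social_profiles = {
    "+32470000001": {"name": "User A", "interests": ["parenting", "loans"]},
    "+32470000003": {"name": "User C", "interests": ["cycling"]},
}


def stitch(left, right):
    """Merge two record sets on their shared key (here, a phone number)."""
    shared_keys = left.keys() & right.keys()
    return {key: {**left[key], **right[key]} for key in shared_keys}


combined = stitch(messaging_profiles, social_profiles)
for phone, profile in combined.items():
    print(phone, profile)
```

Neither record set on its own links a name to late-night activity and an interest in loans; the merged profile does, which is the sense in which the combined data is worth more than the sum of its parts.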

From minimal harm to outright discrimination

Big Data can be used for beneficial aims like optimising traffic or helping to prevent or cure certain diseases. On the other hand, correlations derived from Big Data can also be used to discriminate against people in access to healthcare or financial services, or to surround them with a “filter bubble” so that they spend more time on social media and advertising profits are maximised.

Arguably, if an advertisement is wrong 10% or 15% of the time, it does not matter too much: it remains commercially viable and does minimal harm. There are, however, examples where women who have lost their babies at a very young age keep receiving baby ads, which puts a lot of psychological strain on them.

Another controversial example is price manipulation based on the battery level of a user’s smartphone: users with a low battery will tend to accept higher prices. Even more worrying, being wrong 10% of the time on credit ratings or on access to health insurance can lead to social exclusion.

From learning about behavior to shaping behavior

Big Data enables anyone to learn new information by examining correlations across huge data sets. But it can also be used to shape behavior in turn. On social media, experiments by Facebook have already shown that the mood of users can be manipulated via the information that is displayed to them. In the world of work, managers are gradually being replaced by algorithms. Human managers have their flaws, biases and preferences, and might not treat each employee equally. An algorithm, on the other hand, can measure the performance of all employees in real time. For instance, by monitoring the number of customers in a store and their spending patterns in real time, an algorithm can measure which salespersons encourage customers to spend more, rank them accordingly, and perhaps even distribute penalties or rewards in real time. This indicator is called the “shopper’s yield”, and similar indicators could be developed for any and all forms of employment: the number of emails sent by employees, the number of phone calls made, the number of words written per day, etc.
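
As an illustration of how such an indicator could rank employees, the sketch below computes a simplified “shopper’s yield” in Python. The formula used here (revenue divided by the number of customers present during a shift), the `Shift` record and the sample figures are all assumptions made for the example; the brief does not specify how the indicator is calculated in practice.

```python
from dataclasses import dataclass


@dataclass
class Shift:
    salesperson: str
    customers_in_store: int  # visitors counted during the shift
    sales_total: float       # revenue recorded during the shift, in euros


def shoppers_yield(shift):
    """Assumed definition: revenue per customer present during the shift."""
    if shift.customers_in_store == 0:
        return 0.0
    return shift.sales_total / shift.customers_in_store


def rank_by_yield(shifts):
    """Return (salesperson, yield) pairs sorted from highest to lowest yield."""
    scored = [(s.salesperson, shoppers_yield(s)) for s in shifts]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data for three salespersons.
shifts = [
    Shift("Alice", customers_in_store=40, sales_total=1200.0),
    Shift("Bob", customers_in_store=25, sales_total=1000.0),
    Shift("Carla", customers_in_store=60, sales_total=900.0),
]
ranking = rank_by_yield(shifts)
for name, score in ranking:
    print(f"{name}: {score:.2f} euros per customer")
```

With these figures Bob ranks first even though Alice sold more in total, which shows how strongly the outcome depends on what the algorithm is told to measure.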

As mentioned above, Big Data could exclude people from accessing financial services or healthcare, but it could also be used to “normalise” consumer behavior. For instance, consumers will get real-time feedback on what to eat, how to manage their money or how to drive better in order to lower their risk premium or, indeed, to be able to afford one at all.

In short, algorithms have disciplinary power!

Artificial Intelligence’s fuel

Big Data is the “food of choice” for developing artificial intelligence (AI). AI is the “next frontier” of Big Data and algorithms: where human programmers used to design algorithms to look for specific correlations in data sets, AI can simply observe masses of data and identify by itself correlations, patterns of behavior or anything else which might be of “interest”. Feed an AI enough images of cats and it can identify a cat in any picture; feed it enough paintings by a certain painter and it can transform any picture into a painting in that painter’s style.

AI relies on unlimited and continued access to ever more data, which poses serious threats to privacy. Self-driving cars, for instance, have to collect and monitor as much data as possible in real time to be as safe as possible, but such massive data collection may raise both privacy issues (spying by hackers, private companies or governments) and safety issues (hacking and taking over the controls of a self-driving car remotely).

Ethics and governance at the center of Big Data

Much of the focus is on urging users to “protect their own data” or “manage the information they share”, but such recommendations are unrealistic. Data is often collected without users’ knowledge and/or consent, and can hardly be controlled.

Rather, policy makers should put in place strong and independent ethics governance bodies, with a clear judicial mandate grounded in law, to reflect on what should/shouldn’t be allowed in terms of data processing, algorithms, data collection, etc.

Much of the coming years’ developments will rely on the implementation of the GDPR and other regulation related to Big Data. The GDPR includes many principles and provisions which are very general in nature and will require interpretation by the courts, which will in turn build up a strong body of case law.

At the users’ level, in the event that regulation fails to protect them, tools to help them protect their data and their privacy will proliferate further. Many independent NGOs and digital rights advocates, like members of EDRi, have already created such tools. One example is the “Privacy Badger” plugin for Chrome, developed by the Electronic Frontier Foundation, which blocks certain cookies from following you online.

Another approach consists of feeding randomly generated and completely inconsistent data into services which try to collect data about a user. Such technology could completely destroy the Big Data industry by polluting data sets with false information.
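
The principle behind this kind of data pollution can be sketched in a few lines of Python. This is an illustrative toy, not the implementation of any existing tool: the decoy topics, the `pollute` function and the ratio of three decoys per real event are assumptions chosen for the example.

```python
import random

# Hypothetical pool of unrelated topics used to generate decoy activity.
DECOY_TOPICS = ["gardening", "motor racing", "opera", "fishing", "chess", "knitting"]


def pollute(real_events, decoys_per_event, rng):
    """Add randomly chosen decoy events to the real ones, then shuffle them."""
    mixed = list(real_events)
    for _ in range(len(real_events) * decoys_per_event):
        mixed.append(rng.choice(DECOY_TOPICS))
    rng.shuffle(mixed)
    return mixed


rng = random.Random(42)  # seeded only so that the sketch is reproducible
real_events = ["baby clothes", "mortgage rates"]
observed = pollute(real_events, decoys_per_event=3, rng=rng)
print(observed)
```

A service observing this stream sees eight events of which only two are genuine, and it has no direct way of telling which ones reflect the user’s real interests.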

Big Data will therefore either thrive if it serves the public interest, or be fought and resisted if it fails to do so and lends itself to discrimination and manipulation.

For more information about COFACE’s recommendations, please read the policy briefing “Current challenges and the impact of digitalisation on families”, drafted in the context of the OECD Ministerial Conference on Digitalisation.

Download our policy brief Big Data: Threat or Opportunity?
