Big Data, Ethics and Society

March 2019 Pulse 

We live in an age where we witness a truly unprecedented explosion of data – its collection, sharing and analytics. The phenomenon that came to be known as Big Data has impacted every sector of society – economics, politics, policing, security, science, education, policy, health care, public health, etc – in profound ways.

Although definitions vary, Big Data can refer to (1) the data or information collected, known as datasets, and (2) the process of analysing these ‘big’ datasets. ‘Big’ points either to the quantities and the electronic sizes of the data accumulated (gigabytes, terabytes, petabytes, etc) or to the techniques and technologies employed to analyse the data. As Brent Mittelstadt and Luciano Floridi explain, ‘The latter approach defines “big” in procedural rather than quantitative terms, by connecting the size of the dataset to its complexity, understood in terms of computational or human effort necessary for analysis.’

Big Data has to do with the volume, variety and velocity with which information or data is generated, processed, analysed and used.

In today’s world, the volume of data generated is truly staggering. We are no longer thinking only in terms of terabytes (one million million bytes [12 zeros]). We are thinking in terms of exabytes (one quintillion bytes, that is, a thousand raised to the sixth power [18 zeros]) and zettabytes (one sextillion bytes, that is, a thousand raised to the seventh power [21 zeros]). Scientists have estimated that by 2025, the Internet will exceed the brain capacity of the entire human population on planet Earth!
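For readers who want to check the scale of these units, here is a minimal sketch in Python. It assumes the decimal (SI) definitions of the units; the names `UNITS` and `zeros` are illustrative and do not come from any source cited in this article.

```python
# Decimal (SI) byte units expressed as powers of ten.
# Assumption: SI definitions, not the binary (1024-based) variants.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def zeros(n: int) -> int:
    """Count the zeros that follow the leading 1 in a power of ten."""
    return len(str(n)) - 1


for name, size in UNITS.items():
    print(f"1 {name} = 1 followed by {zeros(size)} zeros (in bytes)")
```

Each step up the list multiplies the previous unit by a thousand, which is why an exabyte is a thousand to the sixth power and a zettabyte is a thousand to the seventh.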

The variety of the data generated is equally mind-blowing. As Kord Davis points out, ‘Performance metrics from in-car monitors, manufacturing floor yield measurements, all manner of healthcare devices, and the growing number of Smart Grid appliances all generate data.’

In addition, because we leave our digital footprints and a trail of personal information whenever we use the Internet, Big Data impinges upon individual lives in unprecedented ways. Eric Freeman and David Gelernter, who coined the term ‘lifestream’, describe this phenomenon thus:

… a time-ordered stream of documents that functions as a diary of your electronic life; every document you create and every document other people send you is stored in your lifestream. The tail of your stream contains documents from the past (starting with your electronic birth certificate). Moving away from the tail and toward the present, your stream contains more recent documents – papers in progress or new electronic mail; other documents (pictures, correspondence, bills, movies, voice mail, software) are stored in between. Moving beyond the present and into the future, the stream contains documents you will need: reminders, calendar items, to-do lists.

The velocity at which we generate, acquire, process and output data has increased exponentially even as the number of sources and the variety of formats grow at an ever-faster pace. According to a report from IBM Marketing Cloud published in 2016, 90% of the data in the world at that time had been created in the preceding two years alone, at a rate of 2.5 quintillion bytes a day!
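To put the daily figure in perspective, the rough calculation below converts 2.5 quintillion bytes a day into larger units. It is only a sketch: the constant daily rate over two years is a simplifying assumption made here, not a claim from the report, and the names `BYTES_PER_DAY` and `total_bytes` are illustrative.

```python
# The daily rate quoted in the text: 2.5 quintillion bytes (2.5 x 10^18).
BYTES_PER_DAY = 25 * 10**17
EXABYTE = 10**18    # decimal (SI) exabyte
ZETTABYTE = 10**21  # decimal (SI) zettabyte


def total_bytes(days: int, bytes_per_day: int = BYTES_PER_DAY) -> int:
    """Total bytes generated over `days`, assuming a constant daily rate."""
    return days * bytes_per_day


daily_exabytes = BYTES_PER_DAY / EXABYTE
two_year_zettabytes = total_bytes(730) / ZETTABYTE  # 730 days = two years

print(f"Per day: {daily_exabytes} exabytes")
print(f"Over two years at a constant rate: about {two_year_zettabytes:.3f} zettabytes")
```

Under that assumption, the daily rate comes to 2.5 exabytes, and two years of it would amount to a little under two zettabytes.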

The ramifications of Big Data for institutions, organisations, businesses, nations and individuals are still unfolding and therefore cannot yet be fully anticipated. More significantly, Big Data forces stakeholders and society alike to re-conceptualise and recast the familiar social and ethical issues and concerns surrounding information technology. As the Council for Big Data, Ethics and Society points out, ‘Big data’s broad ethical consequences strain the familiar conceptual and infrastructural resources of science and technology ethics.’

What, then, are some of the pressing ethical concerns surrounding Big Data? I list a few in the remaining space of this article.

Privacy and Consent

The issue of privacy is invariably highlighted and discussed in the literature on Big Data ethics. Unlike in the past, when data collection was limited by human perception and cognition, in the era of Big Data the collection of data by information technologies is automated and autonomous. The scope of the data has also grown exponentially over the past two decades. These characteristics of the age of Big Data have made privacy and personal safety an even more important and pressing issue.

Alongside the issue of privacy is that of informed consent. Traditional approaches to informed consent, where consent is sought from individuals who participate in a single study, no longer apply. This is because Big Data is designed to reveal unintended and even unexpected connections between data points. ‘Broad’ and ‘blanket’ consent mechanisms (as opposed to single-instance consent) have been suggested, but these are not without their own problems.


Ownership

The next ethical issue has to do with ownership. As the European Economic and Social Committee explains, the issue of ownership ‘revolves around how to consider a user’s data that was produced after processing the original dataset: are they still a user’s data, or do they belong to the company that carried out the analyses? Or to the company that collected the original data?’

The concept of ownership is further complicated by the question of rights. This has led some to speak of two forms of ownership: the right to ‘control’ data and the right to ‘benefit from’ data. ‘Control’ ownership suggests that the data subject has the right to restrict undesired uses of the data, while ‘benefit’ ownership refers to the data subject’s right to utilise Big Data for personal benefit.

The ownership question has sparked complex debates on ethics, legislation and public policy.

Surveillance and Security

The availability of more data and the advancement of technology have made it possible not only to track an individual, but also to generate insights into his behaviour. The ubiquity of CCTVs, positioning capabilities in mobile devices (GPS), and the use of credit and ATM cards for payments and withdrawals all contribute to the surveillance and profiling of individuals.

While the ease of tracking has undoubtedly benefited society in some ways, such as ensuring public safety and enabling swifter and more efficient police investigation of crimes, it has also come at a cost. The diagonal and non-directional nature of surveillance that takes place across the different levels of society can limit the liberties of members of society in subtle ways.

The second issue is security. In January 2019, The Guardian reported that a data breach at Yahoo, whose full extent was disclosed in 2017, compromised 3 billion accounts. Other breaches include Marriott International (500 million customers), LinkedIn (164 million), Sony’s PlayStation Network (77 million), Uber (57 million) and Ashley Madison (31 million). Institutions in Singapore have also had their fair share of data breaches in recent months, due to the work of malicious hackers or fraudsters.

Social Ramifications

Besides these serious ethical concerns, there are also a number of social ramifications of Big Data that should never be trivialised. One major concern is that Big Data can create a digital divide in society, thereby worsening the inequality that already exists. The digital divide refers to the difficulty that some face in accessing services delivered by new technologies due to their unfamiliarity with them.

Big Data can also lead to the dehumanisation of, and discrimination against, individuals and groups when opinions and perceptions are formed on the basis of their digital identities (information about them obtained from different sources). The Norwegian Data Protection Authority explains that this takes place when ‘we are no longer judged on the basis of our actions, but on the basis of what all the data about us indicates our probable actions may be.’

In other words, Big Data can ‘de-incarnate’ individuals by presenting distorting abstractions and caricatures that feed prejudice, discrimination and even hatred in a process known as the dictatorship of data.

The final question that we must consider is that of epistemology. There is a tendency in both mass media and industry to adopt a disturbingly naïve approach to the ‘facts’ presented by Big Data. They work on the assumption that Big Data is ‘objective’ and that it has the ability to reveal reality without the need for interpretation and critical assessment.

According to this epistemology, data are supposed to be able to ‘speak for themselves’ and the ‘truth is already there, waiting to be discovered’, implying that there is no need for theory or hypothesis. For example, confidence is placed in ‘data-driven science’, whose authority and impeccability are judged by the amount and the compelling quality of the data it presents.

This mythological view of Big Data, which, as it were, signals the ‘end of theory’, will have serious consequences if it goes unchallenged.

As we celebrate the promise of Big Data and what it can offer to society, we must also be cognisant of the dangers that lurk in the brave new world of information technology. We must be aware not only of complex ethical issues such as privacy, ownership and security, but also of the harmful distortions that Big Data introduces to the many things we take for granted, such as human identity, dignity, value and relationality, and of how these distortions may undermine our society.

Dr Roland Chia is Chew Hock Hin Professor of Christian Doctrine at Trinity Theological College and Theological and Research Advisor for the Ethos Institute for Public Christianity.