Several times over the last year, researchers have written about the notion of data justice (Heeks and Renken here, Johnson here, and Dencik, Hintz and Cable here). They use this terminology to bring together important concerns about the way that data, and big data in particular, are affecting society, politics and development. These authors each focus on a different aspect of the idea of data justice, and each contributes important insights. In this paper, I did my best to think through how their approaches fit together.
Taking these other insights into account, I came up with three concerns that seem central to data justice: visibility through data, engagement with digital technologies, and nondiscrimination. These aspects can be brought together in a way that challenges current approaches to regulating and using digital data on the societal scale. In brief, here’s how it looks:
As is obvious from this diagram, there are elements of this proposed notion of data justice that work fine with current laws and frameworks. The right to be counted and the right not to be discriminated against, for instance, are expressed in laws around the world, and although we might debate the details of what is to be protected, we have a solid idea of what they mean.
Others, however, pose more of a challenge. The right to be counted and represented does not combine well with the freedom to be invisible to the extent we choose (both contained in ‘visibility through data’). Is it right, for instance, that people should be able to opt out of specific kinds of data collection and resale? Consider the massive scale of the secondary data market: virtually everything we do using digital devices is tracked, and that information is sold on into a data market worth billions of dollars. What kind of regulatory apparatus would be necessary for people to be able to selectively opt out? Just as importantly, what are the civic implications, now that governments around the world are experimenting with commercially collected big data?
This complexity means that the right to be counted is starting to conflict with the right to privacy in entirely new ways. Data protection laws protect our data, but on the assumption that it is going to be sold on and that we don’t object. EU data protection regulation, for example, says we should be able to track what data is held about us and know what is happening to it, but not that we can actually opt out of having it used commercially at all. And even those legal protections are of little use once my data travels across borders into the global market. So if I don’t want my online search history, or my Facebook ‘likes’, both of which tell the most intimate and specific story about me, to be available to any company willing to pay for them, there is currently no legal apparatus that can help me. The only options are to stop using the services entirely, or to switch to Tor, which is a great invention but doesn’t, for example, work with full functionality on my phone.
data protection regulations don’t envisage that we might want to opt out, even selectively, from having our data used
And here we come to the freedom to determine one’s own use of technology. This seems simple: you choose what phone you use, and you choose whether, and how, you go online. Well, not so much. It turns out we are using a lot of technologies we don’t necessarily want to use, or don’t think of as a choice because we are never given the opportunity to opt out. If you live in a high-income country, for example, it’s likely you have to use a digital system to connect with your government and pay your taxes. Yet those systems frequently expose your data to misuse and fraud. If you live in the UK, your hospital admission data may now be shared with Google, in a way that does not provide for people to opt out.
In India the scenario is more serious: it is now compulsory for the poor to use Aadhaar (a quasi-governmental ID system) and its attached e-commerce apparatus because the government has decided to make all small transactions (i.e. any transaction by someone poor) cashless. You can read Silvia Masiero’s excellent analysis of this here. Similarly, anyone walking down the street in a city is effectively ‘using’ data technologies set up by the city and paid for by their tax money, whose job is to monitor behaviour and interactions with city infrastructure. All of us who work in smart buildings that can tell when we enter or leave, what we do in our offices, and how we are behaving, are ‘using’ those technologies – just not voluntarily. Where is the choice in this landscape?
So here is the question that arises from these problems: if we acknowledge that many policymakers believe ‘data is the new oil’, what is our plan for alternative energy sources?
In the 1850s, when petroleum oil was (re)discovered and put to commercial use, it replaced another oil that is almost forgotten today. Before petroleum, people used whale oil to light their lamps and cities. Before we relied on carbon deposits, we relied on whales. And just like carbon deposits, whales eventually proved problematic as an energy source. For one thing, the supply of whales turned out not to be infinite: one factor that spurred entrepreneurs to invent the oil refining process was that we were starting to run out of whales to boil.
if we had known that whales were not an endless resource, or that global warming would occur, might the history of the energy sector look very different?
So oil became the ‘renewable alternative’ to whales. And today, we are having to invent renewable alternatives to oil. But imagine if we had known that the whales would run out, or that global warming would occur. Would we have invested in developing multiple options at the same time, and might the history of the energy sector look very different?
If we can imagine an alternative past where we were aware of the drawbacks and risks of our choices, can we also imagine an alternative future for data, not built on surveillance? What kind of framework would allow us to plan – by rejecting or opting into – secondary uses of our data? This might include donating data about ourselves in a limited way to specific good causes such as genetic data banks; devising ways to control the data profiles our children build up before they are old enough to know their implications; and setting up agreements between people, government and companies for the kind of data use we would like. Civic trusts are one model for this kind of transaction, but we need a lot more.
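To make the idea of selectively opting into secondary data uses a little more concrete, here is a minimal sketch of what a machine-readable consent record might look like. Everything in it is a hypothetical illustration (the class and purpose names are invented for this post, not an existing standard or legal mechanism); the one design choice it encodes is that purposes are denied by default, and the data subject opts in, or back out, purpose by purpose.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentPolicy:
    # Hypothetical per-person consent record: purposes are denied by
    # default, and each secondary use must be opted into explicitly.
    allowed: set = field(default_factory=set)

    def opt_in(self, purpose: str) -> None:
        """Grant permission for one named secondary use of my data."""
        self.allowed.add(purpose)

    def opt_out(self, purpose: str) -> None:
        """Withdraw permission for one named secondary use."""
        self.allowed.discard(purpose)

    def permits(self, purpose: str) -> bool:
        """Anyone reusing the data checks this before doing so."""
        return purpose in self.allowed

policy = ConsentPolicy()
policy.opt_in("genetic-research")          # e.g. donate to a genetic data bank
print(policy.permits("genetic-research"))  # allowed: explicitly opted in
print(policy.permits("ad-targeting"))      # denied: never opted in
```

The point of the sketch is the default, not the code: a framework of this shape inverts today's arrangement, in which data reuse is presumed permitted unless (and often even if) we object.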
All this is to suggest that we should not be comfortable living from one day to the next with the innovations and energy sources we rely on, waiting for untenable problems to emerge before we are forced to think of something new. We can learn from history, if we choose to. Perhaps big data is where that learning should begin.