In a recent paper, Dennis Broeders and I suggested that in the era of big data, we see rather than read. States have traditionally gathered data on their citizens in person, through survey methods, and have used those data to inform policymaking. James C Scott, in his book Seeing Like a State, refers to this as a process of making people legible – the state organises and sorts its citizens by categorising them, grouping them and physically moving them around in order to make them more easily governable. This kind of making-governable, however, also includes an element of reciprocity. If you are connecting with citizens in order to change their behaviour, you are also opening up the possibility of feedback and representation.
Big data, however, is changing this. By connecting new populations to the internet and mobile phones, technology companies are often collecting detailed data on groups that have not previously been well-surveyed using traditional methods. People who may not have been legible to the state, for example pastoralists or people in informal settlements, are now visible. Yet this visibility does not contain the possibility of representation. It is fairly obvious that if you become visible to Google, Facebook or any of the other technology giants, this does not mean your presence or your needs will be reflected back to your own government.
Big data can also be used to create new maps. In the paper, we refer to the notions of ‘shadow maps’ and ‘state data doubles’. Shadow maps reflect the idea that, with the location details generated by people using electronic communications and with sensor data from satellites and drones, the ability to map is being decentralised to new organisations and people. This creates the potential for multiple unofficial mappings of territory and land use that do not connect with state cartography.
Together, these territorial mappings and human visibilities add up to the ability to create ‘data doubles’, an idea from the surveillance literature, denoting the way that we all have digital selves that are recreated through data. These selves used to be created for the use of government bureaucracies, for example for purposes of taxation, law enforcement or public health. With the advent of big data, however, these data doubles can be created by firms, and even by individuals if they have access to data about us. They can also be taken up to the state level, where Google knows more about a country than its own government. In the case of France or the US, this is not necessarily problematic because the government knows quite a lot already. In the case of a country such as Angola, however, which did not conduct a census between 1970 and 2014, this puts Google quite some way ahead.
All this raises the question of how these new data sources may be used. They offer immense potential to map, chart activities and development, and to inform national and international authorities for beneficial purposes. On the other hand, if they are misused or misappropriated, they offer immense power to know and see, but also to influence. One facet of this power is the expansion of markets: people may be translated from a fairly marginalised existence to being identified by tech giants as potential consumers, without necessarily gaining a better connection with their own government along the way.
Another facet of this power is the ability to track, to monitor and, potentially, to delete those data points that pop up as troublesome or undesirable. This leads to the question of how big data may change the tactics of oppression. Nathaniel Raymond of the Harvard Humanitarian Initiative has written of how a satellite project to monitor violence in South Sudan in fact created more precise cartographies of violence that also allowed hostile parties to map out potential victims. In the field of epidemiology, the discussion around the recent Ebola epidemic tended to leave out the fact that the data available from mobile phones in many West African countries is only detailed enough to enable mass quarantine on an indiscriminate basis, not to track who actually has the disease and treat them. There have been proposals by large tech firms to apply big data analytics to the refugee crisis on the borders of the EU in order to track refugees on the presumption that they are a security risk and must be prevented from criminal behaviour.
Beyond an unprecedented ability to see group dynamics, big data also tells us about networks and community formation. It can tell us who influences whom, who is in contact with whom, and it can also tell us in real time how communities are forming around particular ideas or activities. This is fascinating when applied to the global spread of awareness regarding zombie activity, but potentially ruinous if you are an Iranian dissident whose social network privacy settings have just been set to ‘open’ by well-meaning utopians at Facebook headquarters.
What, then, does big data mean for violent and authoritarian regimes? In some cases, very little. Much large-scale violence (Rwanda’s, for example, or Kenya’s) actually takes place amongst neighbours and has little need of datafication to identify targets. Big data does, however, lend a hand to those who want to be more systematic. Its potential for surveillance is immense, as is its potential to sort, categorise and group. If you want to target only those who are influencers in the field of political activism, for example, then social media data can help. If you want to perform a systematic analysis of groups living on land you would like to claim, and know where to find them at a particular time, then mapping digital traces and signals is the way.
What big data does, above all, is make it possible to do things remotely that previously had to be done in person. Digital traces can, for example, help you target enemies in urban environments who would otherwise be impossible to pick out without inside knowledge from spies and infiltrators. The use of big data is unlikely to reduce the civilian deaths that occur in urban warfare, however, since the kind of authorities that end up fighting urban guerrilla wars tend not to prioritise their citizens’ safety in the first place.
It also makes it possible to identify without identifying. What Luciano Floridi terms ‘ontologically constitutive information’ – data that can identify you without naming you, such as your location details over time, your internet searches or your online social activities – can be used to make groupings that reflect people’s activities, beliefs and preferences in much more precise ways than, for example, the map of Jewish Amsterdam commissioned by the Nazis. So one possibility is that big data will make violence and oppression more targetable and specific.
The problem of being able to identify people without knowing their names may also mean that our new data technologies could make people unsafe in entirely new ways. The largely faith-based standard of anonymisation plays a role in this – faith-based because it is a central element of all privacy policies and guidelines, but is almost universally agreed to be ineffective over time given what Bruce Schneier calls ‘the declining half-life of secrets’. As new datasets about a given population emerge, linking and merging them will almost always lead to the possibility of reidentifying those whose records have been anonymised. Anonymisation is problematic because it creates the impression that it is safe to share data, whereas some kinds of data should not be allowed to spread at all. Take, for example, current reports of a case in which refugees were registered in a camp’s database, and that data was then anonymised and shared with the refugee organisation’s partners for purposes of service provision. The data showed that a certain number of people in the camp were unaccompanied female minors. Somewhere along the line the data got out, and the mere fact that such a number existed made those girls targets for a (successful) trafficking attempt.
Despite the well-known satellite images of remote missile strikes, most violence is fairly local in nature. It is perpetrated in messy, politically complicated situations where people are violent towards people they can see in front of them – even when it does feed off new communications technologies, as with the ISIS and Syrian government fighters who take mobile phones away from those fleeing the war to look at their social media profiles and figure out which side they support (as current reports from Syria claim).
By increasing the possibilities and accuracy of remote violence, however, big data may change the field in some ways. It provides both new ways to make people visible, and new ways to visualise their environment. Data analytics are a new kind of pivot point where data about people meets those who wish to use it. Big data gives power to those who can position themselves at that pivot point, either by intercepting data flows or by becoming legitimate recipients of them, for example through agreements with tech companies or by forcing those companies to give up data. Or just by watching for the moments where tech companies make a mistake and open up everyone’s privacy settings to the public.