Starting to think about the discussion we are going to organise in 2014 at the Rockefeller Foundation’s Bellagio Centre on big data and social change in the developing world. Initially this seems very broad: big data could be almost anything; the developing world similarly. Our emerging project definition of ‘big data’ is data that is incrementally larger than anything people have had to deal with before, in a given field. This works fine as long as people are working on similar kinds of systems to those available in the countries where ‘big data’ has become a Thing – something to tweet about, hold events about and generally make money from. However, if you’re working in the poorest countries, almost anything becomes big data due to a lack of connectivity, storage and processing capacity. Equally, if you’re somewhere in between, there are threshold issues. For those in Bangalore, big data may look quite similar to the way it looks in London. For those in the countryside 100km from Bangalore, it may look similar to the way it looks in Burkina Faso.
So how to define it in the context of social change in developing countries? I’m starting to think that a useful jumping-off point may be to visualise it in relation to activism and social change, rather than to try to locate it within the technology ecosystem. There’s a whole ecosystem of people working towards social change, and within that a system of people doing that work in developing countries. Within that, there’s the group using digital tools to accomplish their aims. And within that, there is a corner where people are doing that work either using, or about, data sources which are large and complex. The physicists and mathematicians we’ve talked to as part of our ‘big data in the social sciences’ research – those who have always worked with large datasets – are referring to it as ‘rich data’, i.e. data which is multidimensional, and more specifically data where those multiple dimensions involve people and their interactions.
The term ‘rich data’ suggests a qualitative aspect that is far from the Hadoop/datacube/cloud discussions that characterise this research area. It’s one that’s much closer to the reason for this gathering we’re proposing: there are new data sources either already out there or being collected now that have the potential to influence the way richer countries ‘do’ development – but also to influence the way people in developing countries engage with development, hold their own governments and foreign ones accountable, and pursue their rights as citizens. For instance, institutions are leaping on the still-nebulous promise of big data – see this blog post from the head of the UN’s Global Pulse on ‘data philanthropy’. The idea is also giving rise to efforts that are somewhere between CSR (Corporate Social Responsibility) and Open Data initiatives, such as the D4D initiative by Orange.
With the help of organisations such as the Centre for Internet and Society in Bangalore and the Open Data Research Network, we’re starting to put together ideas on what a useful discussion of big data and social change in a developing-world context might look like. This is particularly good as a new perspective for me: my research has focused on how the internet is reinvented and translated as it is brought to new areas of the world, so for me the challenges of ‘developing countries’ have been those of the poorest, where just getting more than a dialup connection can be an all-consuming challenge. This discussion will be located where developing-world concerns overlap with those of industrialised countries: around privacy, identity, accountability and censorship, along with how new sources of data may spark ideas among activists for new ways of reading the digital environment or organising their activities. These are all issues which have become the stuff of daily life in industrialised countries, not just for activists but for citizens in general.
There is no reason to separate industrialised from developing countries in terms of the development of digital citizenship, and digital challenges to citizenship. We are just at different places along the spectrum of engagement with these issues, and there may be much to be gained from connecting the different parts of the spectrum, forming feedback loops and sharing ideas. Anyone who doubts that should check out the difference between this release of mobile phone data and the release referred to (obliquely) in this European research. If the data relates to an industrialised country, even most of the researchers who work on it generally won’t know which country, which company or what period the data relates to. For usage data collected by a company in a developing country, by contrast, the rules governing release for research purposes appear to be very different.
And, as mentioned at the start of this post, there is the question of what constitutes a developing country. All I have to offer on this is my favourite map, which just offers a different perspective on how to decide which bits of land are centre and which are the periphery. If you figure out who’s most important, be sure to let me know.