I’m at the European Conference on Complex Systems in Barcelona today, reflecting on whether big data is, or could become, a field of its own. People here are presenting on diverse topics such as financial risk, gendered decision-making in politics, the adoption of technology, why people imitate the behaviour of others, and information cascades. What brings them together is that they could all be described as using big data, and that they are mostly involved in a type of research that brings together knowledge, qualitative data and intuition with data-driven approaches. This describes a lot of good social science – but the difference here is that the group consists of physicists, psychologists, computer scientists, political scientists, mathematicians, biologists and sociologists. So is it possible that this nexus of research, along with the other academic research currently using big data, might be seen as a field?
In favour of this is what one might term the ‘hairball phenomenon’, where you can visualise a large and complex research community discussing big data (as Marc Smith has done here), which makes it look, intuitively, as if this is growing into a field. When you enter Smith’s network visualisation and magnify its detail, you can see that the hairball incorporates people from a range of backgrounds, all around the world, who are interested in questions characterised by ‘big data’. The questions they are asking, however, are highly diverse, stretching from business (the largest knot within the hairball) to sociology.
You could contend that academic fields are characterised by shared questions based on a common body of theory. On this basis, many would argue, it’s impossible for big data to be described as a field. It could be a data analytics problem, the object of a computational approach to social research, a set of methodologies or even a methodological philosophy – but it’s not attached to a coherent body of theory and therefore is not a field.
However, just to play devil’s advocate, ‘big data’ is taking on some of the attributes of a field. It has institutional support, with universities such as Berkeley’s iSchool developing courses around it, funding becoming available to projects focusing on it, journals focusing on big data as a tool for social research, and an academic discourse. Also, as our team at Oxford has conducted around 100 interviews with the research community currently using big data, it has become clear that although many researchers rightly reject the hype attached to the name, they can also, when asked, articulate what they believe to be different about big data (e.g. it’s about understanding human behaviour from the standpoint of the universal, or about the emergent nature of phenomena, or about a high-dimensional understanding of the social), and as they do so they move into the realm of theory. The noise around big data that comes from its unprecedented practical challenges – its (relative) size, its computational aspects, storing it, visualising it – tends to mask this emerging set of theoretical perspectives.
Big data is probably still best classified as a mutation. But mutations are how new species occur. Take, for example, the emergence of economics in the 19th century. It occurred because of what was originally seen as a mutation: the emergence of statistics. Statistics, as a methodological and analytical possibility, drove the evolution of the field of economics through the creation of a political economy. In parallel to the increase in demand for statistics driven by finance specialists and policymakers, a research community developed which then drew in theories from other disciplines: philosophy (Smith and Mill); politics and theology (Malthus and Say); finance (Ricardo); and sociology and history (Marx). In retrospect these scholars are economists, but at the time they came from a variety of disciplines and were drawn in by the availability of a disciplinary space, facilitated by methodological developments, institutional interest and changing social perspectives.
The confluence of support, funding, research practice and discourse represents an evolving and self-reinforcing phenomenon: a political economy of big data. The network shown in Smith’s visualisation may develop its own gravity as other disciplines have done, attracting theory and forming its own academic planet. Or it may just be a hairball – only time will tell.