How to draw people with data – models, reality, and the riddle of the Sphinx

In Sophocles’ play, Oedipus became king of Thebes by solving a riddle posed by the Sphinx: what goes on four feet in the morning, two feet at noon, and three feet in the evening? Oedipus guessed correctly that it was a human: first a baby crawling, then an adult walking, then using a stick in old age. When asked how he did it, he replied ‘neither birds nor sign from heaven … mother wit, untaught of auguries.’[1]

This came back to me recently at a conference on systems research, where a series of highly complex models was presented. They were designed to explain how systems such as social systems, urban configurations and value chains behave and scale, and how modelling techniques can contribute to emergency planning for natural disasters or urban evacuations.

One researcher presented his team’s semantic model of a national energy system: well-constructed and thought out, and highly elaborate. However, since it describes a socio-technical system with huge implications for a country’s functioning and security, the researcher has never been able to get the data to populate it. In discussion with the other complex systems researchers there, I learned that ‘ground truth’ – actual, rather than theoretical, data on human or systemic behaviour – is so hard to come by that most models of complex systems remain relatively or completely untested. A senior researcher told me, ‘there’s no comparison for a model if we don’t know the ground truth. Is it true? We don’t know. It’s as true as any other model.’

Why is the data missing? Well, often (as with the energy use model) the ‘mother wit’ referred to by Oedipus is data collected by firms or governments and therefore proprietary, sensitive or both. Sometimes it doesn’t exist yet because no one has collected it – for example, data seldom gets collected on people’s behaviour during disasters because authorities are too busy responding to do a time-and-motion study. Big data may change the latter problem, however, given that people with mobile phones and other devices now emit data which is collected automatically regardless of what else may be going on (e.g. the data used in this study of people’s behaviour directly after the 2011 Oslo bombing).

The attempt to model complex phenomena with regard to people’s behaviour – figuring out people through data – is a practice of solving riddles. A model poses a riddle: it quantifies something which we experience as qualitative, such as a human being, and forces us to go through a different kind of thought process to identify it. It’s a form of helpful alienation, an exercise in seeing beyond our accustomed perspective. Besides being a question, the Sphinx’s riddle is also an answer to another one: How does a human being look to the gods?

Models force us to adopt, as Sandy Pentland calls it, ‘the god’s eye view’. But the god’s eye view needs to be accompanied by the human perspective because it has a history of being incomprehensible and unforgiving – just ask Job, or Oedipus. In order to make models of social phenomena speak a language we can understand, we have to add in social theory and qualitative information. Christopher Barrett does this in his agent-based model of people’s behaviour after a (hypothesised) nuclear explosion: what can existing data tell us about how people evacuate an area under conditions of extreme danger? The data say that people don’t behave rationally, but instead run towards ground zero because they have family members or friends there, or refuse to stay in a safe place because they don’t have information on what’s going on outside. The study’s conclusion is that rapidly re-establishing communication networks is as important as advance information and planning: if people can contact their friends and family, they are more likely to help others nearby, which leads to better outcomes overall. It’s easy to see how big data can help inform this kind of model – data collected from people’s cellphone signals during the aftermath of an earthquake, for example, could help us understand more about the irregularities in behaviour and how they relate to communications and social ties.
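To make the idea of an agent-based model concrete, here is a deliberately tiny sketch in Python. It is not Barrett’s model: the function name, the probabilities and the single decision rule (an agent heads towards the danger zone only if they have family there and cannot reach them) are all assumptions invented for illustration, loosely following the finding described above.

```python
import random


def share_heading_to_zone(n_agents=1000, p_contact=0.2,
                          p_family_in_zone=0.3, seed=0):
    """Toy agent-based sketch (hypothetical rule and parameters).

    Each agent may have family inside the danger zone and may or may not
    be able to reach them by phone. Assumed rule: an agent heads towards
    the zone only if they have family there AND cannot contact them.
    Returns the fraction of agents heading towards the zone.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    heading_in = 0
    for _ in range(n_agents):
        family_in_zone = rng.random() < p_family_in_zone
        can_contact = rng.random() < p_contact
        if family_in_zone and not can_contact:
            heading_in += 1
    return heading_in / n_agents


if __name__ == "__main__":
    # Compare a broken network with a largely restored one.
    for p in (0.1, 0.9):
        share = share_heading_to_zone(p_contact=p)
        print(f"contact probability {p}: {share:.1%} head towards the zone")
```

Even a toy like this shows why ground truth matters: the outcome depends entirely on the assumed probabilities and on the decision rule, and neither can be checked without data on how people actually behaved.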

The model can never be perfect. What if the area in question includes a religious centre housing people with an unusual level of altruism? What if it has a high share of deaf or blind people? What if people there are extremely high in social status and decide to wait for their helicopters instead? Unlikely examples, but cities are (as the modellers observe) complex places where a large number of possible outcomes are in play.

So who has the data? Data scientists have the ability to make important riddles. They can figure out surprising regularities about the way systems are articulated, and can model things so we can look at them in a new way – and they may even be becoming kingmakers (Nate Silver?) – but ultimately riddles are solved not by throwing a lot of data at them, but by the qualitative databank of knowledge deriving from experience and understanding. So the question becomes, how should connections be built between modellers and social scientists so that rather than siphoning information from published analyses and datasets, the two groups can actually work together in real time? Institutional openness is key – for instance at the University of Duisburg-Essen, cultural theorists are consulting with urban systems modellers because they are interested in each other’s work and because the structure of their research environment encourages it.

This suggests that the most successful environment for modelling the social world is one where disciplinary boundaries are highly porous and research and practice are aligned according to skillsets rather than disciplinary silos. It’s hard to say how this can be achieved within the current research model, however. It’s common for research grant programs to incentivise ‘interdisciplinary research’ – but in order to apply for a grant at all, researchers have to climb a particular disciplinary ladder, working in a particular department and citing that discipline’s body of theory and current debates. In this system only the most successful within a discipline can win the freedom to be interdisciplinary. This means that most interdisciplinary researchers are not doing basic research but instead end up in policy or practice. There are some good examples of gradual change in the system, such as the Oxford Internet Institute or Leiden’s Centre for Innovation, but it’s not enough if researchers still have to negotiate disciplinary hierarchies in order to get to a place of greater freedom. When every university has a team of researchers who spend their time reading novels working alongside a team who spend their time building models of complex systems, the world will be a more functional place – and we’ll all have an easier time getting out of town if something bad happens.


[1] Storr, F. (Ed.). (1912). Sophocles (Vol. 1). W. Heinemann.
