sharing big data: let it be complicated.

I’m at the Internet Governance Forum 2013 in Bali this week, which kicked off this morning with a discussion on ‘Growth and user empowerment through data commons’. The panel consisted of myself, Alan Paul of the World Economic Forum, and Amparo Ballivian of the World Bank’s Open Data initiative. It addressed how big data and open data could be important in different types of country, at different national income levels. The headings ‘growth’ and ‘user empowerment’ do not necessarily belong together, and may in fact be in direct conflict, so it wasn’t surprising that the panel addressed them separately. We discussed how big data is currently contributing to development in low and middle income countries (answer: not hugely, but there is great potential to digitise huge mounds of locally collected and stored written data and use it to complement, fill out or replace national statistics on issues such as poverty, education and health). We also discussed the importance of privacy in this field, and how governance of data sharing should be structured given the complex nature of cross-border data sharing ‘for development’.

The discussion centred on three topics. First, in order to answer questions about advantages, drawbacks and privacy concerns, a taxonomy of data needs to be established so we know who is implicated, where regulatory accountability and enforcement may land, and how and when to address privacy concerns. The taxonomy I suggest distinguishes data that is readily understood and whose meaning is fairly stable (data about things) from data which requires contextual understanding in order to attain its full value (data about people). The latter is potentially much more valuable, but is precisely where the governance problems lie. Amparo Ballivian suggests that the ‘data revolution’ must not be a ‘data evolution’ where new data is treated like previous types of data (household surveys and other national statistics collected at first hand). She proposes that to handle, govern and use the new types of data becoming available, and particularly to turn them into Open Data, it is necessary to involve academics, data scientists, and other experts rather than merely channelling personal data as if it were national statistics.

This leads to the second main topic of the discussion: producing ‘big’ data is a collective act, a collaboration between individual technology users and technology providers. Once data production is treated as collective, rather than trying to attribute ‘ownership’ of data in a unitary way (users vs companies), governance becomes more complex but also easier to conceptualise. Thus part of the current challenge is recognising the problem of privacy and data sharing, and allowing it to be as complex as it really is, rather than trying to fit it into current governance boxes. That is what happens when the usual questions are asked: is it a corporate regulation issue, a national legal issue, an international institutional issue, or a personal responsibility of technology users or data scientists?

This implies the third topic of discussion: big/digital/open data is a cross-border phenomenon, in quite a complex way. The best governance approach may be a multi-level one, where data sharing is regulated under corporate home-country rules and user-country frameworks and agreements, with international-level oversight on top. Most of these structures are not yet in place, or are not enforceable – especially the last – and it remains to be seen whether the IGF can come up with a constructive discussion towards their creation.
