Beware the rise of the data oligarchy

The Bank of St George, founded in 1407 in the Italian Republic of Genova, is one of the world’s oldest banks. It was so powerful that it governed many of Genova’s possessions on the republic’s behalf. This power was based on accumulated capital. The power of accumulated capital can still dominate international affairs, but a new form of power is also emerging, that of accumulated data through loyalty cards, text messages, credit card transactions, web browsing and social networking. Data is the new currency.

Where does this power come from? Cross-linking of different data sources can give deep insights into personality, health, commercial intent and risk. The aim now is to understand and characterise the population, down to the individual level.

Personalisation is the watchword, with targeted search results, social network news feeds, movie recommendations and adverts. The upside is good: we could target specific health recommendations, give better treatments and earlier diagnoses for disease, and provide better support to the elderly and otherwise incapacitated people.

But there are also major ethical questions. For banks it is clear: we own the money we deposit there. They only hold our capital under licence and pay us directly for using it. Ownership of data is less clear. In the past, acquiring data was expensive: it required questionnaires and manual collation of information. Today we leave digital footprints in our wake, and acquisition of personal data is relatively cheap.

Our personal data reflects our interests, our desires: it is a digital extension of our soul. By giving it away so freely to social networks, supermarket loyalty cards and internet search engines, have we engaged in some form of Faustian pact? Our digital souls may not be immortal, but they can certainly outlive us. What we choose to share also affects those around us: my wife and I may be happy to share information about our genetics, but by doing so we are also sharing information about our children’s genomes. Using a supermarket loyalty card gains us discounts on our weekly shop, but also gives the supermarket detailed information about our family diet.

Machine learning is at the heart of the current revolution in artificial intelligence. A major aim of the field is to develop algorithms that better understand all of this data. Already machine-learning techniques are used to recognise faces or make movie recommendations, but as we develop better algorithms for aggregated data, our understanding of the individual also improves. There have been calls from Elon Musk, Stephen Hawking and others to regulate artificial intelligence research. They cite fears about autonomous and sentient artificial intelligence that could self-replicate beyond our control. Most of my colleagues believe that such breakthroughs are beyond the foreseeable horizon of current research. Sentient intelligence is still not at all well understood.

We need better models of data ownership. When interacting with banks we can withdraw our funds, but for data repositories we have no right of deletion. We need a better understanding of how our data is being used already and how it is likely to be used in the future. There are opportunities and risks with the accumulation of data, just as there are for the accumulation of capital. However, one thing seems clear: we need to increase the power of the people. Banks pay interest; perhaps we should be paid directly for the use of our personal data. We need to be made aware of the value of our data and be given rights to control who accesses it. We need to form a data-democracy: data governance for the people, by the people and with the people’s consent.

Neil Lawrence is a professor of machine learning at the University of Sheffield. He is an advocate of open data science and an advisor to a London-based startup, CitizenMe, that aims to allow users to reclaim their digital soul.