Do you know what happened to your data?

Written by Sam Smith. Posted in Features

The ways data is put to use continue to multiply. From purely statistical research in the past to the ‘big data’ bubble, data is being copied and reused in novel and interesting ways. All it seems to take is an idea and a dataset, and the two don’t even have to be connected for government to show interest.

At least three of the five priorities of the government’s forthcoming comprehensive spending review are likely to have large data components. This will set the funding for government bodies for the next five years and is the underlying basis for all decisions made during this parliament. At the same time, the data trust deficit is as big as ever, and growing as policy makers just keep on digging.

However, a strong, independent statistical system has led to an Office for National Statistics (ONS) that is viewed separately from government. The ONS remains a comparatively trusted party and, as the public face of statistics, is in a position to say what proper data handling looks like. The ONS has long required that data should not leave the control of the statistical bodies. The rollout of safe settings by the Administrative Data Research Network, to institutions accessing remote data services in a controlled fashion, is the next step in scaling this proven model.

As much as care.data in the NHS has epitomised the collapse of public confidence in data handling, it's the ONS that has provided the blueprint for how to rebuild that trust. HSCIC is now moving towards a safe setting for all individual-level data and transparent reporting on where patient data goes.

This will be entirely familiar to those who have been using census microdata for over a decade. It's reflected in broad public confidence in the census, supported by the ongoing actions of the ONS to maintain that public trust, even in the face of other, shorter-term interests.

The statistical system can’t rebuild trust in data singlehandedly, but it can offer a mechanism for those bodies that see a problem and wish to be part of a solution. Statisticians cannot impose a solution, nor do they have to. The ONS just has to offer a platform and mechanism for those across government who are willing to tell the public how their own data is used.

The ONS tracks how data from its surveys are used and for what purpose. Individuals with a particular interest in any survey run (whether they were in it or not) can simply opt in to notification of how that survey, wave, etc. was used.

How government departments increase (or decrease) citizens’ trust in their handling of personal data is not substantively within the control of statisticians. Policy wonks will keep wonking, so bad ideas keep returning. What those who want a trusted data environment can do is tell citizens that it is possible to have a mechanism whereby they can find out how their data has been used.

The mechanism makes no promises on truth, nor completeness – those exclusively belong to the participating organisations – but if the mechanism knows, then the citizen will be told. The multitude of government services that use GOV.UK Verify for access could be linked via pseudonyms to a reporting function at the ONS. The ONS can then use a link from the participating services (again, via Verify) to provide a report for citizens on request.
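To make the shape of such a mechanism concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of anything the ONS or GOV.UK Verify actually provides: the names `UsageEvent`, `ReportingFunction` and `pseudonym`, the use of an HMAC to derive per-service pseudonyms, and the demo key are all hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    service: str   # participating service that reported the use
    purpose: str   # purpose as stated by that service (not verified here)


def pseudonym(link_key: bytes, citizen_id: str, service: str) -> str:
    """Derive a per-service pseudonym (assumption: a keyed hash does the linking)."""
    message = f"{service}:{citizen_id}".encode("utf-8")
    return hmac.new(link_key, message, hashlib.sha256).hexdigest()


class ReportingFunction:
    """Hypothetical ledger: stores what services report, keyed only by pseudonym."""

    def __init__(self):
        self._events = defaultdict(list)

    def record(self, pseudo: str, event: UsageEvent) -> None:
        self._events[pseudo].append(event)

    def report(self, pseudonyms):
        # Returns only what the mechanism was told: no promise of truth or completeness.
        return [e for p in pseudonyms for e in self._events.get(p, [])]


# Usage: one service reports a use; the citizen asks for their report.
key = b"demo-only-key"  # hypothetical secret held by the linking party
ledger = ReportingFunction()
p_tax = pseudonym(key, "citizen-123", "tax")
p_health = pseudonym(key, "citizen-123", "health")
ledger.record(p_tax, UsageEvent("tax", "eligibility check"))
report = ledger.report([p_tax, p_health])
```

In this sketch the ledger never sees a real identity, and each service gets a different pseudonym for the same person, so the report can only be assembled by whoever holds the link. Who holds that link is exactly the governance question.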

In 2012, the Department for Education wanted to ‘open’ the National Pupil Database – a linked database of the school experiences of every child in the country, from pre-school to university. This was done for other reasons, but opened up a can of worms.

One particularly large worm was that parents didn't know it existed. The first they heard was that data would be available to entities such as start-ups in order ‘to maximise the value of this rich dataset’. At first, the government were distinctly unwilling to provide a list of who had previously accessed the data. They were defending their processes and didn't want to provide ammunition to critics, who were already concerned.

When they finally provided a list following a freedom of information request, they later said it was the best thing they could have done. Those with concerns could read the project register. For projects that raised small questions, they could look at the published papers of other work by the same group in the same field. While the concern about potentially new uses remained (and those uses were not permitted), the increased knowledge raised trust in the process and the way the rules were interpreted.

Telling citizens about every way government uses their data is the only way for the current broad ‘data sharing’ standoff to be resolved in the long term. Data sharing can do good, but it can also do significant harm, and there's no real way for central government to know the difference in advance. It will come down to knowledge, or the capability for knowledge, on the part of citizens.

If the spending review provides millions for new data projects, but with no intention of providing a mechanism that tells citizens what departments are doing with their data, it will raise questions about what those departments have to hide.

However, this has to be as easy for departments as it is for citizens. Providing as many data usage reports to citizens as there are departments is a duplication of effort and resources. This is what ‘government as a platform’ was intended to avoid. So who will provide the trusted third party relationship between the citizen and a department with which they may have fundamentally untrustworthy relations? The governance of creating such a report is important. If produced by the ONS, this should be an individual report empowering the citizen. If produced by the Home Office, this is a dossier on every citizen in the country.

You may think it's not the statistician's job to govern all data usage within government. You're not wrong, but someone has to provide that assurance to the citizen. If the statistician is told, then the citizen will be too, which is within the remit of the UK Statistics Authority.

It’s the statistics profession that has a strategic interest in high quality data for the long term. Departments can maintain their fictions – it’s statisticians who wish to measure the real world.

When the policy buzzwords move on, and something other than evidence informs policy, you will still be here. Whatever the latest fad may be, it’s the belief in truth, evidence and statistics that lets statistical work continue.

Knowing how data and statistics get produced is in every statistician’s interest.

The views expressed in the Opinion section of StatsLife are solely those of the original authors and other contributors. These views and opinions do not necessarily represent those of The Royal Statistical Society.


Copyright 2019 Royal Statistical Society. All Rights Reserved.
12 Errol Street, London, EC1Y 8LX. UK registered charity in England and Wales. No.306096
