On 13 May 2019, the Data Ethics Special Interest Group (SIG) held its first substantive event, on ‘Characteristics of Good Data Governance’. As part of the imperative to engage audiences outside London, the SIG held the meeting in Newcastle and chose speakers from Newcastle, Edinburgh and Southampton. Sponsorship came from Connected Health Cities (North East and North Cumbria).
Introducing the event as chair, RSS executive director Hetan Shah spoke of the emerging concerns about data ethics, ranging from predictions about social categories of people using data, to collective governance mechanisms.
The day was then split into three sessions: regulation and stewardship, to kick off the day; then health; and finally people and practice.
Peter Wells of the Open Data Institute spoke first, about the work on government-sponsored data trusts and the collaborative enthusiasm that produced three pilot projects. Two of these (city data and illegal wildlife trade) will be going forward to further phases of development.
Catherine Bromley of the Office for Statistics Regulation (OSR) spoke next, on the OSR guidance on data governance, which was published in 2018. It is underpinned by a vision for data to be used for public good and is part of the ongoing work on data linkage which saw the ‘Joining up Data’ report last year.
Finally, Max Tse spoke about the work the National Audit Office has been doing, which identifies data architecture as one of the themes of neglected governance in large projects. Management information needs to be planned and critically reviewed in order to retain control over the progress of major projects, but it is not given sufficient priority in planning or in senior staff capability.
Discussion drew out issues of language, specifying terms such as trustworthiness, the infrastructural nature of data and the need to engage more people at senior level. All three speakers indicated work was ongoing and further outputs were planned in the near future, but coordinating interest was challenging given limited media understanding of the complexity of data and its cross-cutting nature.
The second session featured three speakers from Newcastle University. Madeleine Murtagh spoke first, describing the data governance framework in the METADAC infrastructure and the importance of engaging with communities. She gave examples of projects such as the Great North Care Record where this had been done. Paul Burton followed, outlining a technical project, DataSHIELD, an infrastructure that enables the remote and non-disclosive analysis of sensitive research data. Lastly, Wendy Craig talked through some of the implications of Data Protection Regulations, highlighting the complexity of health. A new body, NHS X, offers potential to lead and coordinate in this area, where the principles of GDPR have been very constructive.
Discussion covered some of the challenges of putting these proposals into practice. Anonymisation is at the heart of most practice in securing data, but it is both impractical (it can restrict use of data beyond anyone’s wishes) and not a panacea (risk of disclosure is reduced but not eliminated). Group approaches are favoured, but they are much more nascent and unfamiliar.
The final session focused on people and practice. Kieron O’Hara (University of Southampton) discussed aspects of data trusts and of trust in data. He gave examples of some practical applications of data sharing on new models and described the need for ‘virtue ethics’, in which data science and statistical professionals do not just take a compliance-based approach but work in the interests of citizens.
Robin Rice (University of Edinburgh) described the services for researchers delivered by librarians in collaboration with university IT services. Data is recognised as a resource that requires management plans, which work well around the people involved but become more difficult when the costs of higher security must be met.
Steve Caughey concluded the meeting by outlining the work programme of the National Innovation Centre for Data (NICD), based in Newcastle. Its vision is to develop capability in businesses by helping them develop IP and the skills of their staff, as well as having a wider impact on the pipeline of data scientists both in the North East and nationally.
Discussion covered how well this is developing in practice, including the shifting skills of new researchers and the challenge of recruiting women into data science. While statistics is more diverse than computing, more needs to be done throughout the education system. It is clear that different models are emerging to use and share data, but little is available that synthesises good practice.
As a new grouping, the Data Ethics SIG is looking to develop its agenda in partnership with others and will be collaborating with the Faculty of Public Health on a meeting on 3 July. Data governance is a clear area to focus on, and professionalism will be a feature of ‘data ethics day’ at the RSS conference on 4 September.