RSS sections and groups meeting reports

Social Statistics Section report: Linking health care records for statistical analysis: care.data, possibilities and problems

Written by Web News Editor. Posted in Sections and local group meeting reports

On October 7, the RSS Social Statistics Section hosted an event to discuss the recent debate about the role of the Health and Social Care Information Centre (HSCIC) in extracting and making available linked GP and hospital records as part of the care.data programme. Care.data has been the focus of continued media attention because of a number of political and ethical issues raised by the proposed scheme. Unsurprisingly, the meeting, held at Errol Street and chaired by Harvey Goldstein (professor of social statistics), attracted a full audience.

Four speakers gave presentations on the practical, technical and privacy issues surrounding the proposed linkage of patient records for care.data: Chris Roebuck (director of benefits and utilisation at the HSCIC), Katie Harron (research fellow, UCL), Liz Little (director of business development, Dr Foster) and Alison Macfarlane (professor of perinatal health, City University London). The audience were then invited to participate in a lively debate.

Chris Roebuck began by introducing the HSCIC and presenting some outputs of analyses based on existing data sources, including analyses of A&E attendances by age (derived from patient date of birth). He explained that the vision for care.data was to optimise services by bringing together primary and secondary care data (extending to social care data in the future). Chris went on to discuss the steps that the HSCIC had taken to address concerns raised after care.data was originally proposed, including a listening exercise involving patients, GPs and healthcare professionals. A phased introduction of care.data will begin imminently with six pathfinder clinical commissioning groups. Snapshot extracts of 4-monthly data will be made available for research and commissioning with certain restrictions: no free text; access to anonymised data only within a secure setting on site at HSCIC; and limited diagnostic codes (see http://www.hscic.gov.uk/article/3915/What-we-will-collect-from-GP-records-under-caredata). Chris also mentioned the ongoing pseudonymisation review, which is exploring different techniques for preserving patient confidentiality within these data, including an option of pseudonymisation at source.
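As an illustration of what "pseudonymisation at source" could look like in practice, the sketch below replaces an identifier with a keyed hash before the record leaves the data provider. This is only one possible technique; the report does not say which methods the review is considering, and the key, the function name `pseudonymise` and the ten-digit identifiers are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held by the data providers; not a real key.
SECRET_KEY = b"hypothetical-shared-secret"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash (HMAC-SHA256) of an identifier.

    Spaces are stripped first so that differently formatted copies
    of the same identifier produce the same pseudonym.
    """
    normalised = identifier.replace(" ", "").strip()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up identifiers, for illustration only.
gp_pseudonym = pseudonymise("123 456 7890")
hospital_pseudonym = pseudonymise("1234567890")
mistyped_pseudonym = pseudonymise("1234567891")  # one digit recorded wrongly

# The same identifier gives the same pseudonym, so records can still be joined.
print(gp_pseudonym == hospital_pseudonym)
# A single recording error gives an unrelated pseudonym.
print(gp_pseudonym == mistyped_pseudonym)
```

Under these assumptions, records with identical identifiers can still be joined on the pseudonym, but a single mistyped digit produces a completely different value, so a near miss can no longer be distinguished from a genuine non-match.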

Katie Harron then presented an overview of data linkage methodology, outlining both deterministic methods (currently used by HSCIC) and more sophisticated probabilistic methods. She highlighted that differences in the way that personal identifiers are recorded can lead to errors in the linkage, and stressed the importance of being able to evaluate the quality of linkage in terms of bias. A set of powerful examples was presented, demonstrating the extent to which analyses can be biased and flawed as a result. Katie pointed out that pseudonymisation at source (prior to linkage taking place) makes it very difficult to identify where recording errors in personal identifiers lead to linkage error, and that requirements for data security need to be balanced against the need to provide data that are accurate, reliable and robust.
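The difference between the two families of methods can be sketched as follows. This is a minimal illustration, not the HSCIC algorithm or the method presented at the meeting: the deterministic rule here requires exact agreement on every identifier, while the probabilistic rule sums Fellegi-Sunter style agreement weights and compares the total with a threshold. The field names, the m- and u-probabilities and the threshold are all made-up assumptions.

```python
import math

FIELDS = ("nhs_number", "date_of_birth", "postcode", "sex")

# Hypothetical probabilities per field:
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
M_U = {
    "nhs_number": (0.95, 0.0001),
    "date_of_birth": (0.97, 0.01),
    "postcode": (0.90, 0.001),
    "sex": (0.99, 0.5),
}

THRESHOLD = 10.0  # hypothetical cut-off for accepting a link


def deterministic_match(a: dict, b: dict) -> bool:
    """Link only if every identifier agrees exactly."""
    return all(a[f] == b[f] for f in FIELDS)


def match_weight(a: dict, b: dict) -> float:
    """Sum log2 agreement/disagreement weights across fields."""
    total = 0.0
    for f in FIELDS:
        m, u = M_U[f]
        if a[f] == b[f]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def probabilistic_match(a: dict, b: dict, threshold: float = THRESHOLD) -> bool:
    """Link if the total weight reaches the threshold."""
    return match_weight(a, b) >= threshold


# Made-up records: the hospital copy has a postcode recorded differently.
gp = {"nhs_number": "1234567890", "date_of_birth": "1980-03-14",
      "postcode": "AB1 2CD", "sex": "F"}
hospital = dict(gp, postcode="AB1 2CE")
stranger = {"nhs_number": "0987654321", "date_of_birth": "1975-11-02",
            "postcode": "ZY9 8XW", "sex": "F"}

print(deterministic_match(gp, hospital))   # exact rule misses the true match
print(probabilistic_match(gp, hospital))   # weighted rule tolerates the error
print(probabilistic_match(gp, stranger))   # unrelated record is rejected
```

With these illustrative numbers, the exact-match rule misses the true pair because of one discrepant postcode, whereas the weighted rule still links it. Missed links of this kind are one way in which linkage error can feed through into biased analyses.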

Liz Little introduced Dr Foster, an organisation owned partly by the Department of Health and partly privately, with an academic partnership with Imperial College. Dr Foster use both local and national hospital data (HES) to evaluate variation in services and so inform decision making on clinical and operational performance. Liz demonstrated the value of linking different data sources in terms of additional depth and specificity for evaluating surgical activity in a small number of hospitals. She expressed the opinion that care.data could enable patient-centric data to be at the heart of quality evidence for change, but that the right systems need to be in place to allow this to work efficiently.

Alison Macfarlane began by highlighting the benefits that linkage of health records at a national level through care.data could bring, as well as the lack of public confidence in the HSCIC and concerns about restricted access to data for researchers. She then discussed the impact of the care.data situation on the public (patients being encouraged to opt out of the care.data system) and on researchers (the moratorium on HSCIC data being released to researchers and the resulting delay in data access). Alison presented her projects on linking maternity health care records and highlighted that the current HSCIC data linkage algorithms, which are used for a range of purposes, are not optimal for maternity linkage. To conclude, Alison recommended that issues surrounding care.data could be solved through increased collaboration between the HSCIC, the ONS, the Administrative Data Service and researchers.

A lively debate then began, with concerns raised by audience members relating to:

  • The possibility of flagging individuals with lifelong conditions within care.data
  • The usefulness (or not) of only a 4-month extract of data from GP records available to researchers, which does not allow coding of patients with chronic conditions
  • The potential for linkage with children’s social care data (currently only adult social care data can be linked with health)
  • The vagueness of the “benefit to health and care” in the requirements for data access (and concerns relating to commercial interests). A subsequent question was raised about what ‘for the promotion of health’ actually means, and there was clearly an appetite to get some clarification on this from DH and HSCIC.
  • Capacity issues at HSCIC and the backlog of data requests causing serious delays for researchers on grant-funded projects
  • Frustration around the access process. Researchers feel they have to provide the same information and reassurances to different groups/access committees, and would like to see the process streamlined.
  • A question was raised about whether or not the academic community is being held up and restricted by safeguards that are being put into place, and whether commissioners and decision makers are being provided with data that is of a higher quality than they are able to deal with.
  • Challenges of the black box of linkage algorithms and anonymisation procedures – the need for increased transparency, integration of research within HSCIC for developing efficient methodologies
  • Competing purposes of the data (administrative versus research)
  • Access to data that has already been linked outside of HSCIC (re-use of data) by third parties (particularly Dr Foster as it is part owned by the Department of Health). Dr Foster clarified that there would be no legal basis for the onward sharing of this data to the wider community because they are working under the instruction of NHS Trusts in a Data Controller/Processor relationship and therefore have no rights over the reuse of the data.
  • Lack of (information on) accreditation process for data intermediaries
  • Where GPs should go for routine access to outcomes and follow-up data for their patients
  • The possibility of using sampling as an alternative to population-level data, for avoiding confidentiality issues

Finally, it was suggested that there should be a legal requirement for transparency about provision of data, algorithms for linkage and anonymisation/pseudonymisation, and accuracy of linkage for data from publicly funded services.

Audio and slides from the meeting have been uploaded to the events page.


 


Copyright 2019 Royal Statistical Society. All Rights Reserved.
12 Errol Street, London, EC1Y 8LX. UK registered charity in England and Wales. No.306096
