Medical research needs data scientists like never before. Dr David Crosby, programme manager for methodology and experimental medicine at the Medical Research Council (MRC), explains why - and which of MRC’s funding programmes are seeking to utilise statisticians and data scientists.
The Medical Research Council (MRC) recognises that it's essential to invest in statistics and data science so we can extract maximum value from biomedical research. Statistics and data science underpin a vast range of medical science applications: clinical trials methodology, mathematical modelling to interpret imaging data for diagnostic use, computational informatics to identify health risks, and improving our understanding of disease mechanisms.
The MRC’s funding Boards, which cover infection and immunity (IIB), molecular and cellular medicine (MCMB), neurosciences and mental health (NMHB), and population and systems medicine (PSMB), are keen to support applications involving statistics and data science, as are its fellowship panels. MRC has placed great importance on developing and implementing these skills and techniques, and a number of initiatives offer opportunities to obtain funding from MRC.
To ensure that medical research is conducted in the most thorough, efficient and robust way possible, novel methodologies need to be developed and implemented. This means discoveries are more reliably and quickly turned into benefits for patients and the general population, and that health research and policy are built on the best possible evidence.
The MRP panel, which funds methodological research projects, has a strong commitment to the development and application of statistics and data science methods in biomedical research. Dr Duncan Lee from Glasgow University (who is funded through MRP) explains why it's so important to employ the right statistical model. 'All statistical models are based on a set of assumptions, which represent a simplified understanding of the biomedical process one is trying to model,' he says. 'Simplistic statistical techniques that make naïve assumptions about the data being modelled are commonly used in biomedical science - due to them being easy to understand and easy to implement in practice and standard software packages. However, it is well known that fitting overly simplistic models can result in incorrect inference, which in turn could lead to invalid conclusions.
'Statistical methodological and software development is thus vitally important, so that more realistic models that make less unrealistic assumptions about the biomedical data and process being modelled can be developed. More realistic models will yield results that are likely to be more accurate and so researchers can have greater confidence in the resulting conclusions drawn, and thus the potential impact on biomedical policy and practice.'
Dr Lee's MRC-funded project is developing statistical methodology (and accompanying software) for identifying temporal trends and spatial patterns in disease risk in small-area data. 'The methodology and software are being made available by the CARBayesST package in the statistical software R, which is free to use for all,' he continues. 'Two examples of the use of this methodology relate to susceptibility to measles in Scotland, and smoking rates amongst pregnant women in Glasgow. The aims in these analyses are to identify where the high-prevalence areas are which need to be targeted for reduction, and which small-areas are not showing decreasing trends in prevalences.'
MRC Skills Review
A recent MRC review of training and skills development identified quantitative skills as a priority area. To support development in quantitative skills, the MRC offers training and capacity building opportunities for early-career statisticians and data scientists through several routes:
- Skills Development Fellowships – personal training fellowship awards with 100% salary plus research costs. These fellowships support talented researchers to develop and consolidate their priority quantitative skills, and support researchers at all career stages who wish to bring quantitative skills into biomedical research and undertake relevant training.
- New Investigator Research Grants (NIRGs) – project funding to provide support for early-career researchers while they are establishing themselves as independent principal investigators. Available through the Methodology panel and all MRC Boards.
- Hubs for Trial Methodology Research – PhD training relevant to application of statistics in clinical trial methodology.
In addition to these opportunities, MRC also supports statistician and data scientist career development at later stages through the Career Development Award and Senior Non-Clinical Fellowship schemes.
The 'big data' explosion is reflected in the biomedical and health sectors, with huge volumes of data created through large population studies, imaging, high-throughput technologies such as genomics and, increasingly, mobile health technologies such as wearable sensors. The NHS holds a wealth of longitudinal health data, from cradle to grave, on the UK’s 65m population. The ability to use NHS electronic health records to identify effective treatments, monitor drug safety, identify public health risks and provide insights into the cause and development of diseases offers the UK unrivalled opportunities in health research on the global stage. Likewise, high-throughput technologies for analysing genomes and biological processes in health and disease are transforming medical research and development and will revolutionise individual healthcare. In parallel with low-cost whole-genome sequencing, large-scale metabolic profiling, imaging and immunological techniques are providing vast depths of information at the cell, tissue and organ levels. Future biomedical research in industry and academia will involve the study of immensely large and complex biomedical data sets. All of this will require extensive statistics and data science expertise and capacity. To this end, MRC and its partners have invested in infrastructure and capabilities to underpin UK health and medical bioinformatics research, which include:
- Establishing the £39m Farr Institute of Health Informatics Research, a network of four centres of excellence with 19 partner institutions providing a cutting-edge platform for 'big data' e-health records research.
- Investing £39m in 2014 in six strategic Medical Bioinformatics awards, creating supportive infrastructure for statistics/data science-driven research into big data arising from genomics, proteomics and other such sources.
- Supporting a collection of large-scale population studies, including the world’s longest running birth cohort, the world’s largest longitudinal study of women’s health, and UK Biobank, a major national resource collecting information and samples on 500,000 adults.
MRC has invested more than £100m in the last two years in informatics capabilities and infrastructure, in addition to ongoing support for quantitative skills development and capacity building. It will continue to support applicants proposing to develop and improve statistics and data science in medical research, and provide opportunities across all career stages to develop these skills and techniques.
Individuals interested in developing applications in these areas are encouraged to contact relevant MRC programme managers.