Chances are we all get probability

Written by Web News Editor on . Posted in Features

With advances in technologies like cancer screening, we need to be as clear as possible when stating results in terms of probabilities.
It’s not just patients who sometimes find risks and probabilities difficult to understand. Doctors can be challenged by them too.
In an experiment in 2004, psychologist Professor Gigerenzer and colleagues at the Max Planck Institute for Human Development set a group of experienced doctors the following problem:
About 1% of women have breast cancer and a cancer screening method can detect 80% of genuine cancers but also has a false positive (or false alarm) rate of 10%. What is the probability that a woman whose test produces a positive result actually does have breast cancer? Most of the doctors thought it was 70%.
Another set of doctors were asked the same question, but this time they were given the data as whole numbers. They were told that 10 in every 1,000 women have breast cancer and that, of these 10, 8 will give a positive screening result, while of the 990 who do not have cancer, 99 will produce a false positive result. Asked to estimate the probability that a woman with a positive result has cancer, most of the doctors could see that it was 8/(99+8), so roughly 7.5%.
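The 'counting heads' arithmetic above can be checked in a few lines. This is a minimal sketch using only the figures quoted in the problem (1% prevalence, 80% detection rate, 10% false positive rate); the function and variable names are our own, chosen for illustration.

```python
def count_heads(prevalence, detection_rate, false_positive_rate, population=1000):
    """Turn the percentages into whole numbers of women, then compare them."""
    with_cancer = prevalence * population            # 10 women in 1,000
    without_cancer = population - with_cancer        # 990 women in 1,000
    true_positives = detection_rate * with_cancer    # 8 of the 10
    false_positives = false_positive_rate * without_cancer  # 99 of the 990
    # Of all the women who test positive, what share actually have cancer?
    share = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share


tp, fp, share = count_heads(0.01, 0.80, 0.10)
print(f"{tp:.0f} true positives and {fp:.0f} false positives per 1,000 women")
print(f"Probability of cancer given a positive result: {share:.1%}")
```

The answer comes out at about 7.5%, a long way from the 70% most of the first group of doctors gave.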
Changing raw probabilities into hard numbers helps make things much clearer. Indeed, simply changing 8% of people to 80 people out of every 1,000 can make a big difference in understanding.
Knowing how the human mind computes clearly helps when it comes to deciding how best to communicate probability. Percentages depend on your base figure, and context doesn’t always make the base figure clear. The same information presented as ‘counting heads’ clearly spells out the group being referred to at each stage and so avoids the problem.
So chances are we do all get probability; it’s how it is explained and communicated that matters most. And when it comes to worrying about having or not having serious conditions, it can make the difference between reassurance and added anxiety.
See the Understanding Uncertainty site for a great animation on screening.

It all depends what you mean by average

Written by Web News Editor on . Posted in Features

Statisticians often make quite a fuss about the various ways of measuring the average – and that’s because averages used wrongly can give a very misleading impression.
The following story, based on a survey of 2,000 drivers, appeared in the Metro newspaper. It raises quite a lot of questions.
A typical driver will jump 87 red lights, spend 99 days stationary on gridlocked roads and share 680 kisses during a lifetime behind the wheel. Motorists will get stuck in traffic 10,000 times, make 1,992 phone calls and check for emails or texts more than 1,000 times.  The average driver will cover 269,296 miles. And the typical adult will have sex 4 times in the car from the age of 17.
The best place to start is with the numbers. How reliable are they? Does anyone really know the number of red lights they jump, or the number of kisses they share? Why is the number of phone calls (1,992) so precise, while the number of times stuck in traffic (10,000) is a round number which sounds like a guess? And how can it possibly be justified to quote the distance travelled by the average driver (269,296 miles) to six significant figures?
Then there are the issues to do with the different types of average. It is not clear whether the figures are means, medians or modes – or a mixture of all three. Perhaps the most common number of red lights to jump is 87, in which case it is the mode. Or perhaps 50% of motorists jump fewer than 87 red lights and 50% jump more, in which case it’s the median. And if the total number of red lights jumped divided by the total number of motorists comes to 87 it’s the mean.
But does this distinction matter? Yes! Look at the last sentence in the story. My guess would be that motorists are divided into two groups: those who do and those who do not. And I would further guess that those who do not are in the majority. So the modal number of times for a motorist to have sex in a car is, I imagine, zero. And then for those who do, both the mean and median will be much higher than 4.
Combining two quite distinct subgroups can give a completely misleading impression. It’s not just a matter of saying what sort of average you are using, but also thinking carefully about whether any sort of average is appropriate at all.
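To see how far the three averages can drift apart when two subgroups are combined, here is a small sketch. The ten figures are invented purely for illustration (they are not from the survey): seven drivers who ‘do not’ and three who ‘do’, chosen so that the overall mean comes out at 4.

```python
from statistics import mean, median, mode

# Hypothetical lifetime counts for ten drivers (invented for illustration):
# seven report zero, three report larger numbers.
counts = [0, 0, 0, 0, 0, 0, 0, 10, 12, 18]

overall_mean = mean(counts)      # total divided by number of drivers
overall_median = median(counts)  # the middle value
overall_mode = mode(counts)      # the most common value

# Now look only at the subgroup of drivers who 'do'.
those_who_do = [c for c in counts if c > 0]
subgroup_mean = mean(those_who_do)
subgroup_median = median(those_who_do)

print("All drivers: mean", overall_mean, "median", overall_median, "mode", overall_mode)
print("Those who do: mean", round(subgroup_mean, 1), "median", subgroup_median)
```

With these made-up numbers the overall mean is 4, yet the median and mode are both zero, and among those who do the mean and median are both well above 4. No single figure describes a ‘typical’ driver.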
goodStats (communicating the information in numbers)
Some great and some not-so-great stats and thoughts on how bad stats can be made good

Neil Sheldon has taught at The Manchester Grammar School for 40 years. He is a Chartered Statistician and Fellow of the Royal Statistical Society. He has been an RSS Guy Lecturer since 2007. He is also course leader for the Certificate in Teaching Statistics offered by the RSS Centre for Statistical Education.


New school league tables - A user's guide

Written by Web News Editor on . Posted in Features

Scotland, Wales and Northern Ireland have taken the decision not to publish school performance indicators. This is not the case in England. Last week saw the release of the DfE’s school league tables based on the results of Summer 2012 exams in 4,000+ state and independent secondary schools.
During the same week, the British Academy’s Policy Centre published ‘School League Tables: a short guide for head teachers and governors’. The guide’s author, Harvey Goldstein FBA, Professor of Social Statistics at the University of Bristol, warns that school league tables seem to offer an easy way of seeing which schools are doing well compared to others. But they are too simplistic a measure to adequately examine the relationship between the quality of what schools provide and the results of tests and exams.
The guide will make head teachers and school governors (and – in our view – parents, press and policy makers too) more confident in their knowledge of what school league tables can and cannot tell you. getstats encourages anyone who might benefit from the guide to take a read.
For more on measuring school performance, including a review of the available evidence to determine the benefits and the problems associated with use of school league tables, see ‘Measuring Success’, a report written by Professor Goldstein and Beth Foley and published by the BA’s Policy Centre in March 2012.
For further data presentations and visualisations around school performance, see the Guardian’s DataBlog.

Extrapolations don't make good forecasts

Written by Web News Editor on . Posted in Features

Adverts for financial products say – though usually in tiny print at the bottom of the page – that past performance is no guide to how things will be in future. Stuff happens, such as banks collapsing, stock markets imploding, wars and pestilence (and their opposites too, booms and prolonged prosperity included).
Put the point in the sort of language statisticians use and you might say extrapolating from yesterday’s trends makes for a dubious forecast of what is to come.
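A toy example shows why. The sketch below fits a straight line to five years of an index that rose steadily, then extrapolates to year six. All the figures, including the year-six ‘shock’, are invented for illustration; they do not describe any real fund or market.

```python
# Hypothetical index values for years 1 to 5 (invented for illustration).
years = [1, 2, 3, 4, 5]
values = [100, 110, 120, 130, 140]

# Ordinary least-squares straight line through the past data.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(values) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    / sum((x - mean_x) ** 2 for x in years)
)
intercept = mean_y - slope * mean_x

# Extrapolate yesterday's trend one year ahead.
forecast_year_6 = intercept + slope * 6

# Hypothetical outcome after an unforeseen shock in year 6.
actual_year_6 = 90

print("Forecast for year 6:", forecast_year_6)
print("Actual in year 6:   ", actual_year_6)
print("Forecast error:     ", forecast_year_6 - actual_year_6)
```

The line fits the past perfectly and still misses year six by a wide margin, because nothing in the past data carried any information about the shock.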

Drive for better stats consumption and production in the voluntary sector

Written by Web News Editor on . Posted in Features

The push is on to help charities and the voluntary sector to be better consumers and producers of statistics. At a recent meeting convened at the Royal Statistical Society, representatives of the National Council for Voluntary Organisations, the UK Statistics Authority, the Third Sector Research Centre and major players in the sector such as the Joseph Rowntree Foundation resolved to work together to improve uptake and training.
For its part the government in the shape of the Office for Civil Society – part of the Cabinet Office – is mounting a new community survey to provide data on volunteering and charitable activities, a part replacement for the Citizenship Survey, which the Communities and Local Government Department decided to end two years ago.
The third or voluntary sector is data rich. Delivering services depends on good knowledge of people and social conditions; charity trustees and grant givers need to measure the effectiveness of their work; voluntary bodies produce data about the people and causes they serve.
But this ‘sector’ is highly differentiated. Many charities are tiny and intensely local. They may lack the time, energy and capacity to conduct reliable sample surveys, read spreadsheets or advance much beyond story-telling and subjective impressions. ‘Evidence is the plural of anecdote,’ the RSS meeting was told. Few charities have the wherewithal to mount a randomised controlled trial, the gold standard for assessing the effectiveness of social interventions.
Official statistics are not always useful, because they cannot be broken down to the local areas, communities and estates where charities operate.
Can norms around the use of data, statistics and evidence be changed? Only if a concerted effort were to be made to educate trustees and staff, and to provide consultancy and external assistance – which is where RSS getstats comes in. But ‘capacity building’ funds of the kind that were available under the previous government have been cut.
In a report last year the UK Statistics Authority saw the need for ‘ongoing support and closer engagement with experts’. Umbrella groups such as the National Council for Voluntary Organisations (NCVO) should get closer to those producing statistics, and shape their decisions about what data to collect.

Department for Work and Pensions invites us to explore its open data

Written by Web News Editor on . Posted in Features

Since proposed reforms of the welfare system were announced two years ago, we have regularly heard reference to Universal Credit, the new benefit set to replace six of what are currently the main means-tested welfare benefits and tax credits. But without access to DWP’s data, and so without a sense of how levels of existing claims have changed over time, how many people are claiming in each authority, the age profile of claimants and so on, it has been hard to gain a big-picture sense of the impact the new benefit would have. The government’s Open Data initiative is beginning to turn this around.

Last week, the Department for Work and Pensions (DWP) launched ‘People and Households claiming Universal Credit, Personal Independence Payment and other benefits’, a consultation on proposals to change the way its statistics are collected and disseminated. The consultation sits alongside the introduction of the new credits and payments and ties in with the government’s White Paper on Open Data, which states that departments will “get more data into the public domain and make sure that data is trustworthy and easy to use”.

DWP managers plan to use what users tell them to shape DWP statistics through to 2017 and beyond. The consultation document includes a series of proposals on which views are sought. Key amongst them is how to further develop Stat-Xplore, an online interactive data analysis and visualisation tool that allows users to explore benefit data and see results in the form of interactive charts and graphs which they can navigate, download and share. DWP sees Stat-Xplore as a key tool in modernising its statistical output (see the DWP Open Data Strategy).

For now Stat-Xplore only contains data on housing benefit claimants, but there are plans for it to soon cover Universal Credit and Personal Independence Payment data, as well as data on those affected by the Benefit Cap once it is introduced. These, and other statistics, will be produced monthly and published three and a half months after the reference date. The intention is to enable Stat-Xplore users to create their own sets of data and compare official claims data by date, local authority, parliamentary constituency, region and claim type.

If you would like to feed in your views of Stat-Xplore, the deadline to get back to DWP is 24 April 2013. If you have wider views on open data, share them via this website or via the RSS’s StatsUserNet website.


Keep calm and carry on debating

Written by Web News Editor on . Posted in Features

Think back…how often in debate – on the radio, on TV – have you heard people state that “x is linked to y”? The ‘y’, for example, could be cancer or the economic slump, and the ‘x’ anything from pollution levels to bacon consumption, low confidence to the weather. In saying that the two are linked, they are really only referring to an association, a statistical pattern, between them. But the implication, sometimes implicit, sometimes explicit, is that ‘x’ causes ‘y’.
But where is the evidence? The job of statistical tests is to tell us whether correlation between two measurable things (what statisticians term ‘variables’) is down to coincidence or otherwise significant. But even when there seems to be strong correlation, this still does not prove causation.
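One common way for strong correlation to arise without causation is a hidden third factor driving both variables. The sketch below simulates that situation; everything in it is an assumption made for illustration (the variable names, the noise levels, the seed), and no real data are involved. Here ‘x’ and ‘y’ never influence each other, yet they are strongly correlated because both depend on ‘z’.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


random.seed(0)  # fixed seed so the sketch is repeatable

# z is the hidden common cause; x and y each equal z plus independent noise.
z = [random.gauss(0, 1) for _ in range(1000)]
x = [zi + random.gauss(0, 0.3) for zi in z]
y = [zi + random.gauss(0, 0.3) for zi in z]

r_xy = pearson(x, y)

# Strip out z's contribution (possible here only because we built the data
# ourselves and know exactly how z enters) and correlate what is left.
x_left = [xi - zi for xi, zi in zip(x, z)]
y_left = [yi - zi for yi, zi in zip(y, z)]
r_left = pearson(x_left, y_left)

print(f"Correlation between x and y:        {r_xy:.2f}")
print(f"Correlation once z is accounted for: {r_left:.2f}")
```

The first figure is high and the second is close to zero: the ‘link’ between x and y disappears once the common cause is taken into account.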


Copyright 2019 Royal Statistical Society. All Rights Reserved.
12 Errol Street, London, EC1Y 8LX. UK registered charity in England and Wales. No.306096
