Open data visualisation: the dawn of understanding?

Written by Alasdair Rae. Posted in Features

Much has been made in recent years of the rise of open data and the possibilities it offers. Whilst achieving much less fame than big data, its more glamorous cousin, open data has begun to revolutionise the way we can understand the world. In particular, it has been seized upon by experts in urban data visualisation in ways that would not have been possible only five years ago.

However, there has also been something of a backlash against what critics perceive to be ‘fancy graphics’ and ‘pointless maps’, so in this short piece I focus on some fundamental open data questions and offer two concrete examples where open data has increased transparency, improved access to information and helped places begin to understand and solve problems.

For me, there are three big questions everyone should ask about open data. First, we must ask 'opened by whom'? The answer to this question, in my view, says a lot about the intentions and aspirations of the organisations that release data. Perhaps data are released out of a genuine commitment to openness; or perhaps releasing them is simply a way to avoid endless freedom of information requests from journalists and academics. Either way, it leads to an increase in transparency, which in most cases is welcome.

Second, we must ask, 'open to whom'? This question for me is also critical, since the assumption that everyone is clamouring to get their hands on raw data seems fanciful. People want to know what the raw data can tell them, but this often requires careful intermediate analysis by skilled professionals (part of the much-cited data-information-knowledge-wisdom hierarchy). This means we need organisations to agree to and follow standards, such as those established by the Open Data Institute.

The third question I always ask is 'open to what'? I ask this because it is possible for open data to be seen as a solution in search of a problem. Is open data immediately open to analysis and critique? The Open Data Institute says that good open data can be linked to, easily shared and talked about, and that it should be available in a standard, structured format so it can be easily processed. I wholeheartedly agree with this, and in my work I often take a visual approach to large open datasets so that people can begin to understand urban and regional trends and patterns in a way that would be impossible by simply looking at a massive spreadsheet.

After considering the three questions posed above, we can often understand more about the current state of open data. Take, for example, the recent release of postcode-sector mortgage lending data in Britain. In relation to the first question (opened by whom?), the Treasury encouraged major banks to release such data in the interests of transparency in the financial sector. This was a welcome development, and seven major banks signed up. The second question (open to whom?) was answered in December 2013, when mortgage lending data were released through the Council of Mortgage Lenders, meaning anyone with an interest could explore local lending patterns.

At the time, this new open data attracted significant media interest both nationally and locally. However, the third question (open to what?) revealed some interesting differences in open data practice. Six lenders released standardised spreadsheets containing their lending data, but one bank (HSBC) simply released large PDFs with details of mortgage lending across Britain's 10,000 postcode sectors. This makes the data 'open' in theory but very hard to use in practice and, I would argue, works against the very principles of transparency that 'open data' seeks to serve.
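To make the difference concrete, here is a minimal sketch, in Python with pandas, of what a structured release allows. The file and column names (lender_postcode_sector_lending.csv, postcode_sector, amount_lent_gbp) are hypothetical stand-ins, not the columns of the actual release.

```python
import pandas as pd

# Hypothetical structured release: one row per postcode sector and lender,
# with the amount lent recorded in a single, clearly named column.
lending = pd.read_csv("lender_postcode_sector_lending.csv")

# Total lending per postcode sector, largest first.
total_by_sector = (
    lending.groupby("postcode_sector")["amount_lent_gbp"]
    .sum()
    .sort_values(ascending=False)
)
print(total_by_sector.head())
```

A PDF release offers no equivalent shortcut: the tables must first be extracted, checked and reshaped before any such analysis is possible, which is precisely the barrier a standard, structured format removes.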

[Figure: Mortgage lending in Great Britain]

This leads to my first example of how open data can be made more usable and shareable through visualisation, in a very simple way. Since I'm interested in urban data and how neighbourhoods function, I was very excited by the release of postcode-sector mortgage lending data. However, I also knew that it would be difficult for people to understand the patterns in the data without some additional work. Therefore, I created a simple mortgage lending map site where users could not only explore mortgage lending patterns across Britain but also query the map to compare lending from different banks in each postcode sector, as demonstrated on the right.

Bringing the data together in one place required time and some expertise, as well as the integration of open geodata from a company called Geolytix, but the end result is that we now have a national mortgage lending map which we can use to identify areas where lending is higher or lower than might be expected. The leap from open data to useful knowledge is not necessarily a big one, but it requires time and effort - a point all too easily overlooked when we think about open data.
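As a rough illustration of the kind of work this involves, the sketch below joins a combined lending table to open postcode-sector boundaries of the sort Geolytix publish and shades each sector by the amount lent. It assumes geopandas and matplotlib, and the file and column names are illustrative rather than the actual ones used for the map site.

```python
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

# Hypothetical inputs: combined lending totals by postcode sector, plus
# open postcode-sector boundary geometries.
lending = pd.read_csv("all_lenders_by_postcode_sector.csv")
sectors = gpd.read_file("postcode_sector_boundaries.geojson")

# Attach the lending figures to the boundaries and draw a simple choropleth.
merged = sectors.merge(lending, on="postcode_sector", how="left")
ax = merged.plot(column="amount_lent_gbp", cmap="viridis", legend=True)
ax.set_axis_off()
ax.set_title("Mortgage lending by postcode sector (illustrative)")
plt.show()
```

The live site adds interactive querying by sector and by lender, but the basic step - joining open data to open geodata and letting the map carry the pattern - is much the same.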

My second open data example comes from the much-maligned city of Detroit in the United States. Its rise and fall over the past 50 years has been well documented, but until recently the state of the city on the ground had not been systematically catalogued and analysed. Detroit has lost the equivalent of the population of Birmingham (around one million people) since 1950, and its population now stands at about 700,000.

This has left a legacy of blight and abandonment, but until the intervention of the Blight Task Force and the Motor City Mapping project, the precise scale of the problem was not well documented. Over a period of two years, citizens of Detroit worked with Motor City Mapping to create a vast open dataset of the city's 380,000 land parcels, resulting in significant new knowledge about the scale, location and severity of Detroit's blight problem. Understanding the problem at this level of detail has been instrumental in developing plans to address the situation. Open data cannot save Detroit, but it serves two important purposes in this case:

[Figure: Property conditions in Detroit]

i) It allows the scale of the problem to be simply and quickly communicated to a wide audience (see the example on the right).

ii) The dataset provides a fine-grained record - and a photograph - of every land parcel in the city, and a means by which policymakers can prioritise their actions, as sketched below.
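On the second point, a minimal sketch of the kind of prioritisation a parcel-level record makes possible: assuming a hypothetical extract of the survey with neighbourhood and condition columns (not the actual Motor City Mapping schema), one could tally conditions city-wide and rank neighbourhoods by their share of parcels in the worst category.

```python
import pandas as pd

# Hypothetical parcel-level extract: one row per surveyed parcel, with the
# neighbourhood it sits in and its recorded condition.
parcels = pd.read_csv("parcel_survey.csv")

# City-wide picture: how many parcels fall into each condition category?
print(parcels["condition"].value_counts())

# A crude prioritisation: neighbourhoods ranked by the share of parcels
# recorded in the worst category (the label used here is illustrative).
parcels["is_blighted"] = parcels["condition"].eq("suggest demolition")
priority = (
    parcels.groupby("neighbourhood")["is_blighted"]
    .mean()
    .sort_values(ascending=False)
)
print(priority.head(10))
```

Real prioritisation would of course weigh far more than a single column, but even a summary this simple shows why a structured, parcel-by-parcel record is so much more useful than anecdote.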

Will open data save the world? No, of course not. But through the release of more and more datasets, and the addition of careful analysis and visualisation by skilled analysts, we can begin to understand our world better. Open data will never tell us what we should do, but very often it will indicate that we should probably do something, and that’s why I like to think of open data visualisation as the ‘dawn of understanding’.

 

The views expressed in the Opinion section of StatsLife are solely those of the original authors and other contributors. These views and opinions do not necessarily represent those of The Royal Statistical Society.
