Report from one of the sessions at the RSS 2017 Conference.
They say ‘a picture is worth a thousand words’, but data scientists may be able to tease even more value from the millions of photos uploaded to the web each year.
At a morning session of the Royal Statistical Society Conference today (Wednesday), Suzy Moat and Tobias Preis of the Warwick Business School Data Science Lab presented an overview of their work with Flickr, an online image sharing platform, and other online services.
Preis first explained how GPS tags in Flickr photos could be used to map users' travel patterns, which may help to improve estimates of visitor numbers to a country. Such data typically come from surveys conducted at ports of entry.
The analysis of travel patterns using photo data relied on the assumption that a user remained in a given country until they uploaded a photo from another country. The research found that Flickr-based estimates of tourist arrivals to the UK (and other G7 countries) were in line with estimates derived from the UK’s International Passenger Survey (and other equivalent surveys).
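As a rough illustration of that assumption, the sketch below counts an "arrival" whenever a user's photo is tagged in the destination country and their previous photo was tagged somewhere else. It is not the researchers' code: the function name `count_arrivals`, the record layout (user, date, country code) and the sample photos are all hypothetical, and a user's first photo is ignored because there is no earlier location to compare it with.

```python
from collections import Counter
from datetime import date


def count_arrivals(photos, destination):
    """Count inferred arrivals to `destination`, grouped by (year, month).

    `photos` is an iterable of (user_id, taken_on, country_code) tuples.
    This record layout is an assumption made for the example.
    """
    arrivals = Counter()
    last_country = {}  # most recent country seen for each user

    # Process each user's photos in chronological order.
    for user, taken_on, country in sorted(photos, key=lambda p: (p[0], p[1])):
        previous = last_country.get(user)
        # The user is assumed to have stayed in `previous` until this photo.
        if country == destination and previous is not None and previous != destination:
            arrivals[(taken_on.year, taken_on.month)] += 1
        last_country[user] = country

    return arrivals


# Invented sample records, for illustration only.
photos = [
    ("a", date(2017, 1, 3), "FR"),
    ("a", date(2017, 1, 10), "GB"),
    ("a", date(2017, 1, 12), "GB"),
    ("a", date(2017, 2, 1), "FR"),
    ("a", date(2017, 3, 5), "GB"),
    ("b", date(2017, 1, 20), "GB"),
    ("b", date(2017, 3, 9), "US"),
    ("b", date(2017, 3, 30), "GB"),
]

monthly_arrivals = count_arrivals(photos, "GB")
```

Monthly counts of this kind could then be set alongside survey-based estimates, which is the sort of comparison described in the talk.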
Moat then discussed a project to investigate self-reported health scores sourced from the UK Census. Looking at a map of health scores alone, it was clear that people living within urban areas, particularly large cities, reported worse health than those in more rural parts of the country. This led the team at Warwick to wonder whether reported health was somehow related to the environment in which one lives, and particularly to how scenic it is.
To explore this, they used data from the online game, Scenic or Not?, which asks players to rate photos of England, Scotland and Wales on a ten-point scale of ‘scenicness’. The photos, in turn, come from a game called Geograph, which challenges people to photograph every square kilometre of Great Britain and Ireland.
In comparing the two sets of data, they found that people living in locations rated as more scenic reported being in better health. To check that this wasn’t simply confirming the rural/urban divide seen earlier, they split the data into rural, urban and suburban areas. They found that even within urban areas, the parts of towns and cities that scored higher on scenicness were also likely to score better on reported health.
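The idea of checking the relationship separately within each type of area can be sketched as follows. This is only an illustration, not the team's analysis: it swaps in a plain Pearson correlation computed per group, the function names are hypothetical, the health score is assumed to be coded so that higher means better, and the numbers are invented purely to make the example run.

```python
from collections import defaultdict
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation_by_area_type(rows):
    """Correlate scenicness with health separately for each area type.

    `rows` is an iterable of (area_type, scenicness, health) tuples.
    This layout is an assumption made for the example.
    """
    groups = defaultdict(lambda: ([], []))
    for area_type, scenicness, health in rows:
        groups[area_type][0].append(scenicness)
        groups[area_type][1].append(health)
    return {area: pearson(xs, ys) for area, (xs, ys) in groups.items()}


# Invented values, for illustration only (health: higher = better).
rows = [
    ("urban", 2.0, 0.70), ("urban", 3.0, 0.74), ("urban", 4.0, 0.79),
    ("suburban", 3.5, 0.76), ("suburban", 4.5, 0.78), ("suburban", 6.0, 0.83),
    ("rural", 5.0, 0.80), ("rural", 6.5, 0.83), ("rural", 8.0, 0.88),
]

by_area = correlation_by_area_type(rows)
```

A positive coefficient within the urban group alone would indicate that the pattern is not just a restatement of the rural/urban divide, although a correlation of this kind says nothing about cause.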
Moat acknowledged that it was still early days for the research programme, and more work was needed to investigate whether there was a causal mechanism at work between scenicness and reported health. But the project served as another demonstration of the potential value of online photos.
- Suzy Moat and Tobias Preis were speaking during the session 'Data science for public good'.