The assembled panel was made up of a mix of data scientists, statisticians and those who straddle the divide. Martin Goodson chaired the evening and representing data science were Chris Wiggins (chief data scientist at the New York Times), Zoubin Ghahramani (professor of machine learning at the University of Cambridge) and Francine Bennett (founder of Mastodon-C).
The statisticians were represented by David Hand (former RSS president and professor of mathematics at Imperial College) and Patrick Wolfe (professor of statistics at UCL and executive director of the UCL Big Data Institute). The event's sponsors (Google, UK Statistics Authority, Mendeley and Qriously) also reflected this meeting of minds across the world of data analysis.
The lively discussion that followed began by considering how data scientists tend to arrive at their position from a very different starting point compared to statisticians. Data scientists often begin their journey from within computer science or the natural sciences, rather than the statistician’s mathematical route. While both eventually become fascinated by what can be achieved with data, crucially, this curiosity is sparked from different angles.
But within these divergent approaches lies the collaboration that will ultimately benefit both professions. On the data science side, experimentation with harnessing vast datasets of ‘found’ data can deliver an incredibly rich resource. On the flip side, statistics has the theoretical power to make sense of these big datasets, just as it has made sense of small datasets throughout history.
However, the data scientists did point out one major turning point on the horizon. Today’s schoolchildren are far more tech-savvy than any previous generation, and the traditional way of teaching statistics will have trouble engaging them. As Chris Wiggins pointed out, children born after the millennium are aware of data science every time they use an internet search engine or Netflix recommends a film to them. Statistics needs to adapt to this and engage the ‘internet algorithm generation’.
The UK’s national statistician John Pullinger wrapped up the discussion by pointing out that the history of the RSS is full of individuals who have used technology to harness data. Moreover, contrary to the title of the event, data science and statistics are in the same world and, in keeping with the tradition of the RSS, they should work together to make it a better place.
A full video of the event is available to view below.