In celebration of the day, I quizzed our data scientist, Sam Leach, on all things data – to find out where he thinks the benefits and opportunities lie.
Katrina Scott: What first sparked your interest in working with data?
Sam Leach: In the previous phase of my career, as an academic and research scientist, I worked at the coalface of observational cosmology. It was there that I got excited about working at the crossroads between data, statistics, computation and machine learning, in order to shed light on ‘how’ and ‘why’ questions.
KS: Why do you think data innovation is so important?
SL: To me, ‘data innovation’, like ‘data science’, is just the outcome of tackling questions where people, process and technology each play a role in figuring out answers. It’s more than just number crunching or improving your use of data; it is about creating changes in behaviour, and that’s why I think it’s having such a pronounced impact and creating such a stir in our culture and society.
It’s exciting to see data science – with its particular set of skills and attitudes geared towards problem solving and making predictions – gaining traction in new domains such as transport, education, and health, besides those where it is already quite developed, such as advertising and e-commerce.
KS: How does open data fit in?
SL: It’s been exciting to watch the open data movement gaining momentum and recognition recently. Open data is the type of data that can be reused without any restrictions, and so acts as a spur both to innovation and to community building and strengthening.
One of the most valuable resources we use in our data science and data visualisation work is OpenStreetMap (OSM), which is like the Wikipedia of world maps and is released under an open licence.
OSM has a strong community and ecosystem. For example, there is a team of mapping volunteers who work to improve the state of maps in areas hit by humanitarian crises. Up-to-date maps are needed by relief organisations to help them navigate the area and allocate resources.
KS: What are the biggest challenges businesses face when it comes to data?
SL: I think businesses find dealing with incomplete and semi-structured historic data sets a challenge – it can be hard to know where to start when information is being held in lots of different systems.
Knowing what data to collect and how to structure it for future use can be difficult, but this is important when you want to move beyond historic analysis and begin to make predictions – to me, this is the really exciting part.
It’s also vital for insights to be explained clearly, so they can be understood by people at all levels of the business – not just analysts.