You will often hear people involved in data and analytics described as “data scientists”. But if you meet one, it’s unlikely they will be wearing a lab coat. And their office is likely to contain just a single computer, rather than benches of apparatus and instrumentation. So are they really scientists? Or is it just a “buzzword” job title designed to make them look more intellectually worthy than they are?
Well, a “science” is a field of study in which it is possible to draw conclusions and advance knowledge through a process of theorising, experimenting and analysing results. If you’ve been involved with business analytics projects you’ll recognise that’s generally how they work, too. Therefore, a person who collects and analyses data and uses it to increase their knowledge is a data scientist.
It is a fairly new term – recorded as first being used in 1960 but not coming into widespread use until the 1990s. Before then, the work and study which is now carried out by what we would call a “data scientist” was simply thought of as a branch of statistics, and its practitioners were statisticians.
However, during that same time period, another field of academic study rose quickly in popularity and prominence. And students of this other new science, computer science, found that the technologies and techniques they were developing could be merged very effectively with those being developed by statisticians.
This led to a huge increase in the amount of data that could be generated, stored and analysed, as well as the speed of that analysis, and therefore the rate at which knowledge could be generated from data. And the crux of data science is the extraction of insights from data.
Of course, simplifying matters to that extent means that anyone turning any data into insights is engaged in data science, for example by reading a textbook. And, to be honest, they are!
But to really qualify as a scientist, as I mentioned above, you should be putting the information through a rigorous, formalised scientific process: identifying a problem that needs solving, theorising how it could be solved, and experimenting with your data to attempt to find a solution. You should also record your results in a standardised way and present them for review and verification to others with knowledge in the field.
This closely reflects the processes carried out every day by professionals with the job title of “data scientist”. In business, the problems will be dictated by commercial goals, and the experimentation will take the form of model-building and simulation. The aim will be to produce results that meet those goals, and that are also repeatable because we understand exactly how they came to be. Just like a real scientist!
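To make that process a little more concrete, here is a minimal sketch in Python of what one such experiment might look like. Everything in it is an assumption made for illustration: the business question (does a new page layout raise the average order value?), the made-up order figures, and the `permutation_test` helper, which is a hypothetical name rather than anything from a particular library. It states a hypothesis, tests it against the data, and records the result in a fixed, repeatable form.

```python
import random
import statistics


def permutation_test(control, treatment, n_permutations=2000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one would appear if the group labels were assigned at random."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    observed = statistics.mean(treatment) - statistics.mean(control)

    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = (statistics.mean(pooled[n_control:])
                - statistics.mean(pooled[:n_control]))
        if abs(diff) >= abs(observed):
            extreme += 1

    # Record the outcome in a standard structure that others can review.
    return {
        "observed_difference": observed,
        "p_value": extreme / n_permutations,
        "n_permutations": n_permutations,
        "seed": seed,
    }


# Hypothetical, made-up order values (not real data).
control = [20.0, 22.5, 19.0, 21.0, 23.5, 20.5, 18.5, 22.0]    # old layout
treatment = [24.0, 26.5, 23.0, 25.5, 27.0, 24.5, 22.5, 26.0]  # new layout

result = permutation_test(control, treatment)
print(result)
```

Because the random seed is recorded alongside the result, anyone re-running the same test on the same data should get the same answer, which is the kind of repeatability and review described above. A real project would of course use far more data and would likely rely on an established statistics library rather than a hand-written helper.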
Generally speaking, data science represents the convergence of three previously separate (though closely related) scientific disciplines – statistics, mathematics and computer science.
So, in some ways it’s a patchwork of existing bodies of knowledge and methodologies. But the process of putting them together gives rise to possibilities beyond those offered by any one individual area.
Some argue that data science is still just an extension of the study of statistics, boosted by better computing power and increased storage, and to be fair, they do have a good point. But as with everything today it’s largely a matter of branding, and “data scientist” certainly sounds sexier in my opinion than “statistician”. Universities and colleges are jumping on the bandwagon, increasingly offering courses at undergraduate and postgraduate level titled “Data Science”.
So, there’s my overview of what exactly is meant by the term “data science”, why I feel it deserves the title of “science” (and why its practitioners deserve to be called “scientists”) and why it is so fundamentally important to the Fourth Industrial Revolution.