What is a data scientist and do I need one?
A good friend once told me: “If your profession is not represented by a cartoon animal, your job description is made up and society does not need you”. This is when I realized that my life was a lie and I was condemned to eternal despair. But I digress. On the subject of data scientists, this is a role that has only recently been introduced to the marketplace, so I think it’s important to ask ourselves what this role is and who can benefit from a data scientist.
The evolution of data
In recent years, data has undergone profound changes. Not only has the technology behind data storage and management evolved dramatically, from standard relational models to distributed solutions, but the place of data in the enterprise and in people’s minds has changed as well. Suddenly data is becoming a sexy buzzword instead of a necessary evil. Indeed, data has become its own entity within business organizations, with entire teams dedicated to it. Companies no longer ask “do I really need to keep this data?” but “how can I make sure that I keep all the data I have?”. With this shift, new roles started to emerge, and this is when “Data Scientists” were introduced.
Buzzword or actual role?
Many argue that data scientists are just a fancier replacement for business/data analysts; “A Data Scientist is a Data Analyst Who Lives in San Francisco” as you can read in this article (a very good read I might add). I agree to a certain extent: data scientists are people who dive into software to get results that will ultimately help make business decisions. Company leadership has always relied on this type of analysis from experts called business analysts. Business analysts even use business intelligence software to do data mining, generate statistics and guide business solutions, which are some of the principal prerogatives of data scientists.
But I do think there is a fundamental change to be considered: data platforms are now a separate piece of software. Before the advent of big data, software used data layers. Nowadays, you have data lakes, data virtualization layers and real-time data warehousing that are their own entities. Using these platforms requires a combined set of skills: knowing how to use data platforms intimately (skills formerly owned by data administrators) and being able to generate business intelligence data out of them (skills formerly owned by business analysts).
As such, I think that a new designation for this combined set of skills is fair, and it looks like Wikipedia agrees with me by calling data science an interdisciplinary field: “Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases (KDD).”
Do I need to hire a data science team?
I think that there is a better question to be asked: “what am I doing with my data?”. Don’t get me wrong, the trend of wanting to accumulate as much data as possible is great. It is especially great for me, since I work for a company that provides data management solutions. But I have seen implementations of massive data lakes take months or even years, with very little use coming out of them, and this is a shame.
New data platforms give businesses a tremendous opportunity. Instead of relying on the wisdom of visionaries or accumulated experience to make difficult business decisions, we get to gather evidence and make an informed decision. But you need to know what you want to know first. Once you do, you can decide which platform is good for you and what type of data scientist you should hire. This will give you much more tangible results than buying a huge data platform, hiring an army of data scientists and doing fundamental data research. OK, I made that last part up, fundamental data research is not a real thing… yet!