In this article we look at the differences between three professions that are often confused with one another, or even treated as three names for the same job.
These professions are:
- Data Analyst
- Data Mining Specialist
- Data Scientist
There are no official definitions of these professions, so it is not obvious how to tell them apart.
We therefore offer our own version of how they differ, based on foreign blogs, foreign job postings and, of course, our own opinions.
A Data Analyst is a person who performs descriptive analysis of data, interprets it, and presents a report to interested parties.
The main skills of this role are:
- Excellent knowledge of the subject area within which the data is analyzed. The subject area here means a specific line of business (for example, the oil and gas industry or the premium alcohol trade).
- Understanding of the business specifics of the company the analyst works for
- Good presentation skills
- Familiarity with a data visualization tool (for example, Tableau) and the ability to make charts that are attractive and understandable to non-specialists
- Basic knowledge of statistics and the ability to use simple data analysis tools (for example, Excel)
- Optionally, knowledge of a programming language
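To make "descriptive analysis" a little more concrete, here is a minimal Python sketch of the kind of summary a data analyst might prepare before building a chart or a report. The sales figures and the `describe` helper are our own illustrative assumptions; in practice the same numbers could just as well come from Excel or Tableau.

```python
from statistics import mean, median, stdev

# Hypothetical monthly sales figures (made up for illustration only).
monthly_sales = [120, 135, 128, 150, 142, 160]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = describe(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that nothing here predicts anything: the output only describes data that already exists, which is the main focus of the analyst's role as we define it.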
Data Mining Specialist
A Data Mining Specialist is a technically savvy specialist who carries out the full cycle of work with data, from finding the data to building predictive models. While processing the data, this specialist focuses on uncovering previously unknown, hidden patterns and makes heavy use of Machine Learning techniques.
That is, the main skills of this expert are:
- A solid mathematical background
- The ability to find and properly prepare data
- The ability to program in one or more languages. These are usually high-level languages such as Python, Java, MATLAB or R
- Knowledge of machine learning methods and algorithms. This may include statistical algorithms, neural networks, genetic algorithms and many others.
- Optionally, the ability to work with Big Data, meaning Hadoop and its standard and non-standard modules.
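As a contrast with descriptive analysis, the sketch below shows about the simplest possible predictive model: a straight line fitted by ordinary least squares in plain Python. The data, the `fit_line` and `predict` helpers, and the choice of linear regression are all our own assumptions for illustration; a real data mining project would normally use a dedicated library and far more data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: advertising spend vs. units sold (made up,
# and deliberately lying exactly on the line y = 2x + 1).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [3.0, 5.0, 7.0, 9.0, 11.0]

model = fit_line(ad_spend, units_sold)
forecast = predict(model, 6.0)  # estimate for a spend level not seen before
print(forecast)
```

The key difference from the analyst's summary is the last step: the model is used to estimate a value for an input that does not appear in the data.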
A Data Scientist is an all-rounder who can do both what the data analyst does and what the data mining specialist does. On top of that, a data scientist has some special skill or a particularly narrow specialization.
The main skills of this expert are:
- Excellent presentation skills, knowledge of the subject area, and the ability to present results to non-specialists (this comes from the data analyst)
- A solid mathematical background, data preparation skills, and machine learning (this comes from the data mining specialist)
- The ability to work with Big Data (very desirable, almost necessary)
- Some special skill or additional specialization (for example, a background in linguistics: several foreign languages and the ability to work with text at an advanced level, i.e. Natural Language Processing)
However, the picture is not so clear-cut for the data scientist: someone may lack half of the skills above and still be considered a data scientist if, for example, they have an excellent command of the others. Say, a person may not know mathematics very well but be a great expert in the subject area. Hopefully I will take a closer look at the classification of data scientists some time later.
Note that we have described the so-called "pure" representatives of each profession. In real life a data analyst may have more skills than listed here, while a data scientist may not have any special skill at all.