
Learn to differentiate these data roles | by Javier Fernandez | Jun, 2022



Data Scientist, Data Analyst, Data Engineer, AI Researcher, ML Engineer, ML DevOps, and Business Analyst, among others

Photo by Rene on Pexels.

If you are reading this article, you have probably also faced the difficulty of differentiating between various data roles such as Data Scientist, Data Analyst, Data Engineer, AI Researcher, ML Engineer, and Business Analyst, among others.

The confusion usually arises because employees’ job titles do not always match their actual responsibilities in the projects they work on. The smaller the company, the larger the gap between an employee’s title and the roles they actually perform.

Therefore, this article summarizes as accurately as possible some of the more “common” data roles. To do so, the article first describes the data roles shown below, then diagrams the main connections between them and summarizes the main differences between the roles that can create confusion.

Figure 1. Data roles. Ref: Image by author.

Data roles

  • Data analyst: Answers questions by analyzing the data and interpreting the results.
  • Business analyst: Processes, interprets and documents business processes, products, services and software through analysis of data to help to form business insights and make more effective business decisions.
  • Data visualization engineer: Designs data reporting solutions that simplify the understanding of the data and its insights.
  • Data scientist: Focuses on solving problems by developing AI-based models rather than on answering questions.
  • AI researcher: Explores and proposes new ways of solving problems that are then used by other roles in real-world scenarios.
  • Data engineer: Builds data pipelines to bring together information from different source systems.
  • Data architect: Defines the policies, procedures, models and technologies to be used in collecting, organizing, storing and accessing company information [1].
  • ML engineer: Builds and maintains artificial intelligence systems to automate predictive models.
  • MLOps engineer: Provides data scientists and other roles with access to the specialized tools and infrastructure (e.g., storage, distributed computing, GPUs, etc.) they need across the data lifecycle. They develop the methodologies to balance unique data science requirements with those of the rest of the business to provide integration with existing processes and CI/CD pipelines [2].

Data roles diagram

If we connect the roles based on their needs and dependencies, we end up with the diagram shown in Figure 2. Below is a description of each block as well as the responsibilities of each role when asked to develop a forecasting model (illustrative example).

Figure 2. Main relationships between data roles. Ref: Image by author.

1. Data preparation:

Data architects plan the organization’s data framework and discuss it with the data engineers, who implement it. Data engineers then create the ETL pipelines that generate the data used by data scientists, data analysts, and business analysts.

For the forecasting task, these two roles would be in charge of building the data framework to collect the data, as well as developing the ETL pipelines to create the dataset.
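As a rough illustration of what such a pipeline does, here is a minimal sketch in Python using only the standard library. The raw CSV export, the column names, and the `daily_sales` table are assumptions made up for this example; a real pipeline would read from the company's actual source systems and follow the framework defined by the data architect.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (assumption for illustration).
RAW_CSV = """date,store,units
2022-01-01,A,10
2022-01-01,B,7
2022-01-02,A,12
2022-01-02,B,
2022-01-03,A,9
"""

def extract(text):
    """Extract: read the rows of the raw CSV export as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: skip rows with missing units and sum the units per day."""
    daily = {}
    for row in rows:
        if not row["units"]:
            continue  # incomplete record, left out of the dataset
        daily[row["date"]] = daily.get(row["date"], 0) + int(row["units"])
    return sorted(daily.items())

def load(daily, conn):
    """Load: write the daily series into a table that other roles can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales (date TEXT PRIMARY KEY, units INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_sales VALUES (?, ?)", daily)
    conn.commit()

# Run the three steps against an in-memory database.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
dataset = conn.execute("SELECT date, units FROM daily_sales ORDER BY date").fetchall()
print(dataset)
```

The resulting table is the kind of dataset that the analysts and data scientists in the next blocks would start from.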

2. Data modeling:

Both data analysts and business analysts work with data; the main difference lies in what they do with it. Business analysts use data to help organizations make more effective business decisions. In contrast, data analysts are more interested in collecting and analyzing data to answer specific questions [4].

In our example, the required role would be the data analyst, who would perform the exploratory data analysis to extract insights.

3. Data analysis:

AI researchers investigate new AI-based methods, which are usually published in open-access archives such as arXiv or in conference proceedings. Data scientists are the main users of this research when solving their own tasks.

In our example, the data scientist would develop the forecasting model based on the current literature, which comes from the research of AI researchers.

4. Deployment and monitoring:

Once the data scientists have built a model, the machine learning engineers work to ship the model into a production environment. Additionally, data visualization engineers design the interface to report the solutions.

For the forecasting task, the ML engineers would take the model developed by the data scientists, optimize it, and put it in production. ML engineers do not necessarily need to understand how the model works. Data visualization engineers would design the interface that is displayed to users.

5. Infrastructure:

MLOps engineers support the other roles by building and maintaining the platform on which machine learning models are developed and deployed.

In the given example, these engineers would develop all the specialized tools and infrastructure (e.g., storage, distributed computing, GPUs, etc.) needed throughout the data science lifecycle.

Data roles differences

This last section describes the differences between the roles that are hardest to tell apart. The task of creating a forecasting model is again used as an illustrative example to compare them.

Data architect vs Data engineer

Data architects design the vision and blueprint of the organization’s data framework, while data engineers are responsible for implementing that vision [3].

For the forecast example, the data architect would focus on defining the policies, procedures, models, and technologies, as well as advising the data engineers on how to organize the data and in what format it should be presented. Thereafter, the data engineer relies on the data architect’s concrete vision to collect the data, store it in the system, and prepare it for analysis.

Business analyst vs Data analyst

While data analysts and business analysts both work with data, the main difference lies in what they do with it. Business analysts use data to help organizations make more effective business decisions. In contrast, data analysts are more interested in gathering and analyzing data that the business can then evaluate and use to make its own decisions [4].

For the given example, data analysts would be responsible for modeling forecasting problems, discovering insights, and identifying opportunities through the use of statistical, data mining, and visualization techniques. In contrast, business analysts would apply a breadth of tools, data sources, and analytical techniques to answer a wide range of high-impact business questions and present insights concisely and effectively. Examples of such business-critical insights include providing estimates of new programs/products and market launches and diving deep into recent events to understand the impact on contact volume.

Data analyst vs Data scientist

Data analysts are generally associated with answering questions, while data scientists focus on solving problems by developing AI-based models that build on the insights found in the data.

For the forecast example, data analysts would focus on discovering insights in the data by checking properties such as stationarity, correlation, autocorrelation, and multicollinearity. Data scientists, building on the analysts’ insights, would focus on the model itself, probably implementing an LSTM neural network, a transformer-based neural network, or a 1D convolutional neural network, among others.
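To make one of these checks concrete, the sketch below computes the sample autocorrelation of a series at a few lags in plain Python. The series is hypothetical and the helper name `autocorrelation` is chosen for this example; in practice an analyst would more likely use a statistics library, but the underlying calculation is the same.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag.

    Uses the common estimator: the lagged covariance divided by the
    variance of the full series (both computed around the overall mean).
    """
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance

# Hypothetical daily demand with an upward trend (assumption for illustration).
series = [10, 12, 14, 13, 15, 17, 16, 18, 20, 19]

for lag in (1, 2, 3):
    print(f"lag {lag}: {autocorrelation(series, lag):.3f}")
```

A clearly positive value at lag 1 suggests that recent observations carry information about the next one, which is the kind of insight the data scientist can use when choosing model inputs.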

Data scientist vs AI researcher

AI researchers explore new ways and propose new systems of solving problems, while data scientists tweak and apply these systems in real-world scenarios [5].

For the forecast task, there is a clear gap between the two roles. AI researchers would not take on this type of task, since their job is to work with publicly available datasets so that they can compare their proposed methods with state-of-the-art (SOTA) ones. The task would instead be given to data scientists, who would use the SOTA models to find the best solution.

Data scientist vs ML Engineer

Data scientists deal with the modeling side of the algorithm, whereas ML engineers focus on the deployment of that same model. Data scientists focus on the ins and outs of the algorithms, while machine learning engineers work to ship the model into a production environment [6].

For the forecast task, data scientists focus on building the forecast model, trying to decrease the error as much as possible. Once finished, the ML engineer is in charge of shipping this model to production, retraining it, and maintaining it.
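The idea of "decreasing the error" can be sketched as follows: hold out the most recent observations, forecast them with each candidate model, and compare an error metric. This minimal example uses two simple baselines (a naive forecast and a moving average) as stand-ins for the neural networks mentioned earlier, with mean absolute error as the metric. The series, function names, and window size are assumptions for illustration only.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon

def moving_average_forecast(history, horizon, window=3):
    """Repeat the mean of the last `window` observations for every future step."""
    recent = history[-window:]
    return [sum(recent) / len(recent)] * horizon

# Hypothetical series; the last three points are held out for evaluation.
series = [10, 12, 14, 13, 15, 17, 16, 18, 20, 19]
train, test = series[:-3], series[-3:]

errors = {
    "naive": mean_absolute_error(test, naive_forecast(train, len(test))),
    "moving_average": mean_absolute_error(
        test, moving_average_forecast(train, len(test))
    ),
}
best = min(errors, key=errors.get)
print(errors, "-> best candidate:", best)
```

Whichever candidate scores best on the held-out data is the one handed over to the ML engineer, who does not need to know how it was chosen in order to ship it.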

ML Engineer vs MLOps Engineer

ML engineers focus on implementing and retraining machine learning models, whereas MLOps engineers support them by building and maintaining the platform on which those models are developed and deployed [7].

For the given example, the role of the ML engineers is to ship the forecasting model to production on the infrastructure maintained by the MLOps engineers, who may be asked, for example, to install specific libraries or CUDA versions.
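A small, hypothetical example of this hand-off: before deploying, an ML engineer might check which required libraries are missing from the environment and pass that list to the MLOps engineers. The sketch below uses the standard library's `importlib.util.find_spec`; the requirement list is an assumption, with `torch` standing in for a GPU-enabled library, and only top-level package names are checked.

```python
import importlib.util

def missing_packages(required):
    """Return the top-level package names in `required` that cannot be imported."""
    return [name for name in required if importlib.util.find_spec(name) is None]

# Hypothetical requirement list for the forecasting service.
REQUIRED = ["json", "sqlite3", "torch"]

missing = missing_packages(REQUIRED)
print("Missing packages:", missing or "none")
```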

References

[1] https://www.techtarget.com/whatis/definitions/D

[2] https://blog.dominodatalab.com/7-roles-in-mlops

[3] Striim, Data Architect vs Data Engineer

[4] Northeastern University, Data Analyst vs Business Analyst: What’s the Difference?

[5] Intern Khoj, AI Research

[6] Medium, Data Scientist vs Machine Learning Engineer Skills. Here’s the Difference

[7] Neptune blog, Is MLOps Engineer a thing?


