Michael Taschler

Cloud Certifications: Microsoft Certified Azure Data Scientist Associate

  • Jun 29, 2020
  • 10 Min read
  • 1,244 Views

Introduction

Demand for cloud data has been booming, and this won't stop any time soon. Organizations have access to more data, and storing that data has become cheaper and easier. The cloud has also lowered the entry point to technologies that were previously limited to major organizations, allowing smaller ones to use these services at a significantly lower up-front investment.

Processing and presenting this data is a challenge, and that is where you come in. As an Azure Data Scientist you will set up an Azure Machine Learning workspace, run experiments and train models, optimize and manage models, and deploy and consume models.

In this guide you will learn about the Microsoft Azure Data Scientist Associate certification and what exam you can take to achieve it.

The Azure Data Scientist Associate certification follows Microsoft's departure from broader certifications like the Microsoft Certified Systems Administrator (MCSA) or its older sibling, the Microsoft Certified Systems Engineer (MCSE). Nowadays the focus is on specific job roles. Please note that this exam is currently in beta.

Target Audience

As the name suggests, this certification has "data science" written all over it, specifically Azure data science offerings and features. Since it is an Associate-level certification, the required exam covers a wide range of data science topics and technologies.

As an Azure Data Scientist you will apply your knowledge of data science and machine learning to implementing and running machine learning workloads on Azure; in particular, using Azure Machine Learning Service. This entails planning and creating a suitable working environment for data science workloads on Azure, running data experiments and training predictive models, managing and optimizing models, and deploying machine learning models into production.

Applicable Exams

A single exam, DP-100, is required to gain the Azure Data Scientist Associate certification. It is important to understand that Microsoft now retires and replaces exams at a much faster pace than in the past. Because the cloud is ever changing, Microsoft also updates live exams frequently. The DP-100 exam has received two updates since the beginning of 2020.

The price for the exam is US$165/€165. Microsoft offers a student discount if you verify your academic status when booking the exam by using one of the following: a school email account, a school account, an International Student Identity Card, or a verification code. Alternatively, you can supply documentation proving your eligibility for the student discount.

Prerequisites

There are no formal prerequisites for this certification beyond passing the DP-100 exam, but hands-on experience with the skills it measures is key to success. Passing the DP-900 Microsoft Azure Data Fundamentals exam and earning the corresponding Microsoft Azure Data Fundamentals certification is not mandatory, but it will help you prepare because it introduces a number of technologies covered in the DP-100 Designing and Implementing a Data Science Solution on Azure exam.

Learning Path for Azure Data Scientist Associate

Ensure that you possess sufficient experience and invest the time to go through the relevant Pluralsight courses and other resources.

Skills Measured

Your skills will be measured in the following four categories:

  • Set up an Azure Machine Learning workspace (30-35%)
  • Run experiments and train models (25-30%)
  • Optimize and manage models (20-25%)
  • Deploy and consume models (20-25%)

These categories are broken down into details as follows:

Set up an Azure Machine Learning Workspace

Create an Azure Machine Learning workspace

  • Create an Azure Machine Learning workspace
  • Configure workspace settings
  • Manage a workspace by using Azure Machine Learning studio

Manage data objects in an Azure Machine Learning workspace

  • Register and maintain data stores
  • Create and manage datasets

Manage experiment compute contexts

  • Create a compute instance
  • Determine appropriate compute specifications for a training workload
  • Create compute targets for experiments and training

Run Experiments and Train Models

Create models by using Azure Machine Learning Designer

  • Create a training pipeline by using Azure Machine Learning designer
  • Ingest data in a designer pipeline
  • Use designer modules to define a pipeline data flow
  • Use custom code modules in designer

Run training scripts in an Azure Machine Learning workspace

  • Create and run an experiment by using the Azure Machine Learning SDK
  • Consume data from a data store in an experiment by using the Azure Machine Learning SDK
  • Consume data from a dataset in an experiment by using the Azure Machine Learning SDK
  • Choose an estimator for a training experiment

Generate metrics from an experiment run

  • Log metrics from an experiment run
  • Retrieve and view experiment outputs
  • Use logs to troubleshoot experiment run errors

Automate the model training process

  • Create a pipeline by using the SDK
  • Pass data between steps in a pipeline
  • Run a pipeline
  • Monitor pipeline runs

Optimize and Manage Models

Use Automated machine learning (ML) to create optimal models

  • Use the Automated ML interface in Azure Machine Learning studio
  • Use Automated ML from the Azure Machine Learning SDK
  • Select scaling functions and pre-processing options
  • Determine algorithms to be searched
  • Define a primary metric
  • Get data for an Automated ML run
  • Retrieve the best model

Use Hyperdrive to tune hyperparameters

  • Select a sampling method
  • Define the search space
  • Define the primary metric
  • Define early termination options
  • Find the model that has optimal hyperparameter values
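The five tuning steps listed above can be illustrated without any Azure dependency. The following is a minimal, library-free sketch and not the Hyperdrive API: the search space, the toy `train_step` objective, and the `slack` threshold are all hypothetical, and the early-stop rule is only loosely modeled on a bandit-style policy that abandons trials falling too far behind the best one seen at the same interval.

```python
import math
import random

# Hypothetical search space: one sampler per hyperparameter.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, -1),  # log-uniform
    "batch_size": lambda rng: rng.choice([16, 32, 64, 128]),
}


def sample_config(rng):
    """Random sampling: draw one value for each hyperparameter."""
    return {name: sampler(rng) for name, sampler in SEARCH_SPACE.items()}


def train_step(config, step):
    """Stand-in for one training interval; returns a made-up accuracy."""
    # Toy objective that peaks at learning_rate = 0.01 and grows with steps.
    quality = 1.0 - abs(math.log10(config["learning_rate"]) + 2) / 3
    return quality * (1 - 0.5 ** (step + 1))


def tune(n_trials=20, n_steps=5, slack=0.2, seed=0):
    """Return (best_config, best_metric, number_of_trials_stopped_early)."""
    rng = random.Random(seed)
    best_at_step = [float("-inf")] * n_steps  # best primary metric per interval
    best_metric, best_config, stopped_early = float("-inf"), None, 0
    for _ in range(n_trials):
        config = sample_config(rng)
        metric = float("-inf")
        for step in range(n_steps):
            metric = train_step(config, step)  # the primary metric (maximize)
            # Early termination: give up if well behind the best at this step.
            if metric < best_at_step[step] / (1 + slack):
                stopped_early += 1
                break
            best_at_step[step] = max(best_at_step[step], metric)
        else:
            # Only trials that ran to completion compete for "best model".
            if metric > best_metric:
                best_metric, best_config = metric, config
    return best_config, best_metric, stopped_early


best_config, best_metric, stopped_early = tune()
print(best_config, round(best_metric, 3), stopped_early)
```

In the real service the sampling method, search space, primary metric, and termination policy are declared in a configuration object rather than hand-coded, but the moving parts map onto the same five decisions.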

Use model explainers to interpret models

  • Select a model interpreter
  • Generate feature importance data

Manage models

  • Register a trained model
  • Monitor model history
  • Monitor data drift

Deploy and Consume Models

Create production compute targets

  • Consider security for deployed services
  • Evaluate compute options for deployment

Deploy a model as a service

  • Configure deployment settings
  • Consume a deployed service
  • Troubleshoot deployment container issues
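Consuming a deployed service usually comes down to sending an HTTP POST with a JSON body to the service's scoring URI. The sketch below only builds such a request with the Python standard library and does not send it. The URI, the key, and the `{"data": ...}` payload shape are assumptions for illustration; the actual input format is defined by the service's own scoring script, and key-based authentication may not be enabled on every deployment.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows, api_key=None):
    """Build (but do not send) a JSON POST request for a scoring endpoint."""
    # Assumed payload shape; the deployed scoring script decides the real one.
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumed key-based auth passed as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


# Hypothetical local endpoint and key, used only to show the request shape.
request = build_scoring_request(
    "http://localhost:8890/score",
    [[0.1, 2.3, 4.5]],
    api_key="hypothetical-key",
)
print(request.get_method(), request.full_url)

# To call a real service, send the request and decode the JSON response:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())
```

Keeping request construction separate from sending makes it easy to inspect the headers and body when troubleshooting a deployment that rejects input.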

Create a pipeline for batch inferencing

  • Publish a batch inferencing pipeline
  • Run a batch inferencing pipeline and obtain outputs

Publish a designer pipeline as a web service

  • Create a target compute resource
  • Configure an Inference pipeline
  • Consume a deployed endpoint

Pluralsight Courses

Make sure you check out Pluralsight's Microsoft Azure Data Scientist (DP-100) learning path, which currently contains 25 different courses split into beginner, intermediate, and advanced sections.

As always, the newer the course the more relevant the material will be to your learning journey.

Other Resources

Microsoft Learn provides several training resources free of charge, including learning paths that cover the skills measured in this exam.

Browsing the relevant topics in Microsoft Docs will also help you prepare for this exam.

Compensation and Employment Outlook

The cloud business has been booming in the last several years. Microsoft has closed the gap with its main competitor and keeps growing. While COVID-19 has affected everyone in some way, it certainly doesn't seem to have had a negative impact on Microsoft's cloud growth.

Gaining an up-to-date certification like the Azure Data Scientist Associate from a household name like Microsoft should make you much more attractive to both your current and future employers. Your current employer might not raise your salary, but the next time you go looking for a job, make sure you check trusted Internet sources for up-to-date information on salaries in your region.

It's difficult to provide absolute figures because they will depend on numerous factors like your experience, company type and size, industry, and region. Expect salaries for experienced data scientists to range from US$120,000 to US$175,000 in the United States.

Conclusion

Earning the Associate-level Azure Data Scientist Associate certification is challenging, but it gives you recognized proof of your expertise in this field. All it takes is a single exam. Sign up for Microsoft Azure, make use of the free cloud credits and services, and book the exam, which you can take at home or in one of many testing centers.

I hope that this guide is useful and wish you good luck with gaining your certification.
