A Cloud Guru

Exploring AML Designer Transforms: Clean Missing Data

Much of the time on a machine learning task goes into understanding the data and getting it into a shape suitable for training a model. This is the data wrangling, exploration, and cleaning phase of the machine learning life cycle. Azure Machine Learning designer provides many common data-changing operations as transform modules. In this lab, you will explore the *Clean Missing Data* module to gain a deeper understanding of the tools at your disposal.


Path Info

Advanced
15m
Sep 24, 2020


Table of Contents

  1. Challenge

    Set Up the Workspace

    Log in and go to the Azure Machine Learning Studio workspace provided in the lab.

    Create a Training Cluster of Standard_DS1_v2 instances.

    Create a new blank Pipeline in the Azure Machine Learning Studio Designer.

  2. Challenge

    Explore Clean Missing Data

    Add an Adult Census Income Binary Classification dataset node to the pipeline. Visualize this raw data to see what data is missing.

    Find a column that is missing values in fewer than 5% of rows. To do so, compare each column's missing-value count with the total row count. Both figures are shown in the Visualize popup.

    Using a Clean Missing Data transformation, remove the rows that are missing data in the chosen column.

    Submit the pipeline to perform the transformation.

  3. Challenge

    Visualize the Transformed Data

    When the pipeline finishes, inspect the output of the Clean Missing Data node. How have the column statistics changed?

    You can chain further Clean Missing Data nodes to clean other columns. To apply the same operation with the same threshold values to several columns, select them all in a single node.
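
The arithmetic behind challenges 2 and 3 can be sketched offline in plain Python. This is only an illustration of the idea (compute each column's missing ratio, pick a column under the 5% threshold, drop the rows missing that value), not the designer's own implementation or API. The helper names and the toy rows are hypothetical, and the values are made up rather than taken from the Adult Census Income dataset.

```python
from typing import Dict, List, Optional

# One row maps column name -> value; None marks a missing value.
Row = Dict[str, Optional[str]]


def missing_ratio(rows: List[Row], column: str) -> float:
    """Fraction of rows whose value in `column` is missing."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def columns_under_threshold(
    rows: List[Row], columns: List[str], threshold: float = 0.05
) -> List[str]:
    """Columns that have some missing values, but in fewer than `threshold` of rows."""
    return [c for c in columns if 0 < missing_ratio(rows, c) < threshold]


def drop_rows_missing(rows: List[Row], columns: List[str]) -> List[Row]:
    """Keep only rows that have a value in every one of `columns`."""
    return [row for row in rows if all(row.get(c) is not None for c in columns)]


# Hypothetical toy data: 40 rows, values invented for illustration.
rows: List[Row] = [
    {"age": "30", "workclass": "Private", "native-country": "United-States"}
    for _ in range(40)
]
rows[3]["native-country"] = None      # 1 of 40 rows missing (2.5%)
for i in (0, 5, 10, 15):
    rows[i]["workclass"] = None       # 4 of 40 rows missing (10%)

columns = ["age", "workclass", "native-country"]
candidates = columns_under_threshold(rows, columns)
cleaned = drop_rows_missing(rows, candidates)

print("columns under 5% missing:", candidates)
print("rows before:", len(rows), "rows after:", len(cleaned))
```

Comparing `missing_ratio` before and after the drop mirrors what the column statistics in the Visualize popup show: the cleaned column reports no missing values, while columns that were not selected keep theirs.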

The Cloud Content team comprises subject matter experts hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!

What's a lab?

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Provided environment for hands-on practice

We will provide the credentials and environment necessary for you to practice right within your browser.

Guided walkthrough

Follow along with the author’s guided walkthrough and build something new in your provided environment!
