Engineering Features for Machine Learning in Microsoft Azure

Authors: Mike West, Ravikiran Srinivasulu, Michael Heydt, David Tucker, Steph Locke

Feature engineering is the process of using domain knowledge and insight into data to define features that enable machine learning algorithms to work successfully.

What you will learn

  • Build features from numerical and nominal data
  • Reduce the complexity in a data set
  • Extract features from text documents or images

Prerequisites

This path is intended for experienced machine learning engineers or data miners looking to apply their knowledge inside of the Microsoft Azure platform.

Beginner

In this section of the path, you will learn to apply simple feature engineering techniques using Microsoft Azure.

Building Features from Nominal and Numeric Data in Microsoft Azure

by Mike West

Nov 7, 2019 / 1h 20m

Description

At the core of applied machine learning is data. In this course, Building Features from Nominal and Numeric Data in Microsoft Azure, you will learn how to cleanse data within the confines of Azure Machine Learning Service. First, you will discover the options you have within Azure Machine Learning Service for building your models end to end. Next, you will explore the importance of applying statistical techniques to your data to improve model performance. Finally, you will learn how to apply various data cleansing techniques to your data to enhance real-world performance. When you are finished with this course, you will have a foundational knowledge of Azure Machine Learning Service and a solid understanding of how to apply statistical techniques to your data, which will help you as you move toward becoming a machine learning engineer.

Table of contents
  1. Course Overview
  2. Setting the Stage
  3. Approaching Normalization and Standardization
  4. Defining Normalization and Standardization Techniques
  5. Leveraging Nominal Data in Machine Learning
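
As a rough illustration of the topics this course names (normalization, standardization, and encoding nominal data), the following is a minimal sketch in plain Python. It is not taken from the course and does not use any Azure API; the function names and sample columns are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: every value equals the mean
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Encode nominal labels as 0/1 indicator vectors, one column per category."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


ages = [20, 30, 40]              # hypothetical numeric column
colors = ["red", "blue", "red"]  # hypothetical nominal column

normalized = min_max_normalize(ages)
standardized = standardize(ages)
categories, encoded = one_hot(colors)
```

Min-max normalization bounds every value between 0 and 1, while standardization centers the column at zero with unit spread; which one suits a model depends on the algorithm and on how sensitive the data is to outliers.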

Preparing Data for Feature Engineering and Machine Learning in Microsoft Azure

by Ravikiran Srinivasulu

Dec 16, 2019 / 2h 20m

Description

Data comes from many different sources, so when you combine them, the result is often inconsistent. In this course, Preparing Data for Feature Engineering and Machine Learning in Microsoft Azure, you will start with data that is unsuitable for machine learning and use different modules in Azure Machine Learning to clean and preprocess it. First, you will learn how to set up the data and workspace in Azure Machine Learning. Next, you will discover the role of feature engineering in machine learning. Finally, you will explore how to identify specific data-level issues for machine learning models. When you’re finished with this course, you will have a clean dataset, processed with Azure Machine Learning modules, that is ready for building production-ready machine learning models.

Table of contents
  1. Course Overview
  2. Getting Started with Azure Machine Learning
  3. Differentiating Data, Features, Targets, and Models
  4. Preparing Input Data for Machine Learning Models
  5. Handling Missing Data
  6. Role of Feature Engineering in Machine Learning
  7. Split a Data Set into Training and Testing Subsets
  8. Identify Data-level Issues In Machine Learning Models
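
Two of the steps listed above, handling missing data and splitting a data set into training and testing subsets, can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the Azure Machine Learning modules the course uses; the helper names, the sample column, and the 25% test fraction are assumptions.

```python
import random
from statistics import mean


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into training and testing subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


incomes = [40.0, None, 60.0, 50.0]  # hypothetical column with one missing value
filled = impute_mean(incomes)

train, test = train_test_split(list(range(8)), test_fraction=0.25, seed=0)
```

In practice the fill value would normally be computed from the training subset only and then reused for the testing subset, so that no information from the test rows leaks into training.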

Intermediate

In this section of the path, you will learn to extract features from text documents using Microsoft Azure.

Building Features from Text Data in Microsoft Azure

by Michael Heydt

Dec 17, 2019 / 1h 55m

Description

Using text data to make decisions is key in creating text features for machine learning models. In this course, Building Features from Text Data in Microsoft Azure, you'll learn to structure your data in several ways that are usable in machine learning models using Microsoft Azure Machine Learning Service virtual machines. First, you’ll discover how to use natural language processing to prepare text data, and how to leverage several natural language processing techniques, such as document tokenization, stopword removal, frequency filtering, stemming and lemmatization, parts-of-speech tagging, and n-gram identification. Then, you’ll explore documents as text features, where you'll learn to represent documents as feature vectors by using techniques including one-hot and count vector encodings, frequency-based encodings, word embeddings, hashing, and locality-sensitive hashing. Finally, you'll delve into using BERT to generate word embeddings. By the end of this course, you'll have the skills and knowledge to use textual data and Microsoft Azure in conceptually sound ways to create text features for machine learning models.

Table of contents
  1. Course Overview
  2. Processing and Simplifying Text to Simplify Feature Creation
  3. Building Features Around Text Data for Use in Machine Learning Models
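
To give a feel for a few of the techniques named in this course description (tokenization, stopword removal, n-gram identification, and count vector encoding), here is a minimal sketch using only the Python standard library. It is not the course's code; the stopword list is a tiny illustrative one, and all function names and sample documents are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n=2):
    """Return the contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_vectors(documents):
    """Return a sorted vocabulary and one count vector per document."""
    token_lists = [remove_stopwords(tokenize(d)) for d in documents]
    vocabulary = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)  # missing terms count as 0
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


docs = ["The cat sat in the hat", "A cat is a cat"]  # hypothetical documents
vocabulary, vectors = count_vectors(docs)
bigrams = ngrams(remove_stopwords(tokenize(docs[0])))
```

Each document becomes a vector with one position per vocabulary term, which is the simplest of the document representations the description lists; embeddings and hashing trade this sparsity for more compact features.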

Advanced

In the final section of this path, you will learn how to extract features from images and how to apply complexity reduction techniques, such as PCA, to your data inside of Microsoft Azure.

Building Features from Image Data in Microsoft Azure

by David Tucker

Sep 19, 2019 / 1h 39m

Description

Computer vision enables insights and experiences that previously weren’t possible, but it can seem daunting to know how to extract the information you need out of an image. In this course, Building Features from Image Data in Microsoft Azure, you will learn how to leverage the tools and services provided by Microsoft Azure alongside popular computer vision and deep learning frameworks to extract relevant information from images. First, you will explore computer vision, its use cases, and also take a look at what Azure provides to make this easier for you. Next, you will learn about the algorithmic approach to computer vision by reviewing popular feature descriptors like the scale-invariant feature transform and the histogram of oriented gradients. Finally, you will delve into deep learning as a tool to leverage in computer vision by creating a convolutional neural network to classify images. When you are finished with this course, you will have both the knowledge and tools to build features out of your image data on Microsoft Azure.

Table of contents
  1. Course Overview
  2. Exploring Computer Vision on Azure
  3. Utilizing the SIFT and HOG Algorithms for Feature Detection
  4. Leveraging Convolutional Neural Networks for Feature Extraction
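
The histogram of oriented gradients mentioned in this course builds on a simple idea: summarize a patch of pixels by the directions in which brightness changes. The sketch below computes that histogram for a single cell in plain Python. It is a simplification under stated assumptions, not a full HOG descriptor (there is no block normalization or interpolation between bins), and the function name, the 9-bin choice, and the sample patch are hypothetical.

```python
import math


def orientation_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees)."""
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    # Border pixels are skipped because central differences need both neighbors.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal central difference
            gy = image[y + 1][x] - image[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold into [0, 180]
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Hypothetical 4x4 grayscale patch with a vertical edge: dark left, bright right.
patch = [[0, 0, 255, 255] for _ in range(4)]
hist = orientation_histogram(patch)
```

For this patch, brightness changes only from left to right, so all of the gradient magnitude lands in the first bin (orientations near 0 degrees).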

Reducing Complexity in Data in Microsoft Azure

by Steph Locke

Dec 10, 2019 / 2h 14m

Description

If you're building models for data science, your feature sets can quickly become complicated and hard to understand. In this course, Reducing Complexity in Data in Microsoft Azure, you will learn how to reduce the complexity of feature sets, making models more understandable, more straightforward to build, and more robust. First, you will learn to understand feature set complexity and how it impacts your models. Next, you will discover a range of techniques to reduce the complexity of your feature sets. Finally, you will explore various advanced methods for feature set complexity reduction. When you are finished with this course, you will have the skills and knowledge needed to reduce the complexity of your models and create more straightforward and manageable models, leading to better and more consistent insights into your data.

Table of contents
  1. Course Overview
  2. Understanding How Feature Set Complexity Impacts Model Quality
  3. Applying Criteria-based Feature Reduction Techniques
  4. Using Principal Component Analysis to Reduce Numeric Feature Sets
  5. Processing Categorical or Text Feature Sets
  6. Going beyond PCA to Reduce Complexity in Numeric Feature Sets
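
Principal component analysis, named in the table of contents above, replaces correlated numeric features with a smaller set of directions that capture most of the variance. The following is a minimal sketch that estimates only the first principal component by power iteration on the sample covariance matrix, using plain Python. It is an illustration rather than the course's method; it assumes non-constant data and a starting vector that is not orthogonal to the leading component, and the function names and sample points are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the leading principal component of numeric rows by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    vec = [1.0] * d  # starting guess
    for _ in range(iterations):
        # Multiply by the covariance matrix, then rescale to unit length.
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))  # zero only for constant data
        vec = [v / norm for v in nxt]
    return means, vec


def project(rows, means, component):
    """Project each centered row onto the component, giving one score per row."""
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(component)))
        for r in rows
    ]


# Hypothetical two-feature data set lying exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, pc1 = first_principal_component(points)
scores = project(points, means, pc1)
```

Because the two features here are perfectly correlated, the single score per row preserves all of the variation, so two columns reduce to one without loss; real data would retain only part of the variance.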