Course

Preparing Data for Machine Learning

This course covers important techniques in data preparation, data cleaning and feature selection that are needed to set your machine learning model up for success. You will also learn how to use imputation to deal with missing data and strategies for identifying and coping with outliers.

Intermediate

3h 24m

(58)

Created by Janani Ravi

Last Updated Feb 05, 2020

This course is included in the libraries shown below:

AI
Data

What you'll learn

As machine learning explodes in popularity, it is becoming ever more important to know precisely how to prepare the data going into the model in a manner appropriate to the problem you are trying to solve.

In this course, Preparing Data for Machine Learning, you will gain the ability to explore, clean, and structure your data in ways that get the best out of your machine learning model.

First, you will learn why data cleaning and data preparation are so important, and how missing data, outliers, and other data-related problems can be solved. Next, you will discover how models that read too much into their training data suffer from a problem called overfitting, in which models perform well on the data they were trained on but struggle in live deployments. You will also understand how models that are trained with insufficient or unrepresentative data suffer from a different set of problems, and how these problems can be mitigated.
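To make the idea of overfitting concrete, here is a minimal sketch in plain Python. It is not taken from the course: the synthetic data, the 20% label noise, and the helper names (`make_rows`, `memorizer_predict`, `threshold_predict`) are all assumptions chosen for illustration. A model that simply memorizes its training rows scores perfectly on them, yet does noticeably worse on rows it has never seen.

```python
import random


def make_rows(n, rng):
    """Synthetic rows: the label is x > 0.5, flipped 20% of the time (noise)."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < 0.2:
            label = not label
        rows.append((x, label))
    return rows


def memorizer_predict(train, x):
    """Overfit model: copy the label of the nearest training row, noise included."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]


def threshold_predict(x):
    """Hand-written baseline that ignores the noise entirely."""
    return x > 0.5


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the recorded label."""
    return sum(predict(x) == label for x, label in rows) / len(rows)


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_rows(200, rng)     # rows the memorizer gets to see
held_out = make_rows(200, rng)  # stand-in for "live" data

memo_train = accuracy(lambda x: memorizer_predict(train, x), train)
memo_held = accuracy(lambda x: memorizer_predict(train, x), held_out)
simple_train = accuracy(threshold_predict, train)
simple_held = accuracy(threshold_predict, held_out)

print(f"memorizer: train={memo_train:.2f} held-out={memo_held:.2f}")
print(f"threshold: train={simple_train:.2f} held-out={simple_held:.2f}")
```

The memorizer's training accuracy is 1.0 because every training row is its own nearest neighbor. The gap between that figure and its held-out accuracy is the symptom to watch for.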

Finally, you will round out your knowledge by applying different methods for feature selection, dealing with missing data using imputation, and building your models using the most relevant features.
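As a rough illustration of these two steps, the sketch below fills missing values with the column mean and then ranks features by their absolute Pearson correlation with the target. It is an assumption-laden example, not the course's own code: the toy housing columns, the prices, and the helper names (`impute_mean`, `pearson`, `select_top_k`) are hypothetical, and mean imputation and correlation ranking are just one simple choice among many.

```python
from statistics import mean


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_top_k(features, target, k):
    """Keep the k feature names most strongly correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data; None marks a missing value.
raw = {
    "sqft": [800, 1000, None, 1400, 1600],
    "rooms": [2, 3, 3, None, 5],
    "noise": [7, 1, 9, 2, 6],
}
price = [160, 200, 240, 280, 320]

# Step 1: impute every column. Step 2: select the two most relevant features.
clean = {name: impute_mean(col) for name, col in raw.items()}
selected = select_top_k(clean, price, k=2)

print(clean["sqft"])
print(selected)
```

Imputing before selecting matters here because the correlation helper cannot handle `None` values. In a real project, both steps should be fitted on the training split only, so that information from held-out data does not leak into the model.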

When you’re finished with this course, you will have the skills and knowledge to identify the right procedures for data cleaning and data preparation to set your model up for success.

Table of contents

Course Overview | 1m 48s

About the author

Janani Ravi

200 courses

4.5 author rating

6281 ratings

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Preparing Data for Machine Learning

Course Overview | 1m

Understanding the Need for Data Preparation | 40m

Implementing Data Cleaning and Transformation | 47m

Transforming Continuous and Categorical Data | 46m

Understanding Feature Selection | 24m

Implementing Feature Selection | 44m
