Cleaning Data with Pandas

by Pratheerth Padman

Learn to clean and manipulate data using the Pandas library in Python. Cover common issues like missing values and irrelevant features, use correlation analysis, encode categorical features, and prepare data for machine learning models.

What you'll learn

In the real world, data is rarely organized into neat tables that can be fed directly into a machine learning model or used for data analysis. The data you find is often messy, has many missing values, and tends to have other issues that you need to resolve before you can draw any meaningful inference from it.

In this course, Cleaning Data with Pandas, you will learn how to use the Pandas library in Python to clean and manipulate data.

First, you will understand what data cleaning is and why it is so important in the context of data analysis. Then, you will solve the most common issues plaguing datasets - missing values, irrelevant features, and duplicate values.
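As a rough illustration of those three issues, here is a minimal sketch of one way they might be handled in Pandas. The toy dataset, its column names, and the choice of median and placeholder fills are assumptions made for this example, not necessarily the approach taught in the course.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; the column names are illustrative only.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "age": [34.0, np.nan, np.nan, 45.0, 29.0],
    "city": ["Pune", "Perth", "Perth", None, "Dubai"],
    "notes": ["a", "b", "b", "c", "d"],
})

# Duplicate values: drop rows that are exact copies of an earlier row.
deduped = df.drop_duplicates()

# Irrelevant features: drop a column assumed to carry no useful signal.
trimmed = deduped.drop(columns=["notes"])

# Missing values: fill numeric gaps with the median and
# text gaps with a placeholder label (both are assumed strategies).
cleaned = trimmed.assign(
    age=trimmed["age"].fillna(trimmed["age"].median()),
    city=trimmed["city"].fillna("unknown"),
)

print(cleaned)
```

Whether to fill or drop missing values depends on the dataset; `dropna` is the usual alternative when the affected rows can be discarded.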

Next, you will see what correlation analysis is and how it helps in data cleaning.
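One common use of correlation analysis in cleaning is spotting features that carry nearly the same information. The sketch below assumes a hypothetical dataset and an arbitrary cutoff of 0.95; both are illustrative choices, not values taken from the course.

```python
import pandas as pd

# Hypothetical numeric features; height_in is height_cm in different units.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "shoe_size": [7.0, 6.0, 9.0, 8.0, 10.0],
})

# Absolute pairwise Pearson correlation between all columns.
corr = df.corr().abs()

threshold = 0.95  # assumed cutoff for "redundant"

# Flag a column if it correlates strongly with any column that comes before it,
# so the first column of each redundant group is kept.
to_drop = [
    col
    for i, col in enumerate(corr.columns)
    if (corr.iloc[:i, i] > threshold).any()
]

reduced = df.drop(columns=to_drop)
print(to_drop)
```

Keeping the first column of each highly correlated pair is a simple convention; in practice the choice of which one to keep usually depends on domain knowledge.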

Finally, you will see how to encode categorical features and prepare your dataset to be fed into machine learning models.
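To make the idea of encoding concrete, here is a minimal sketch of two widely used approaches: an ordinal mapping for a category with a natural order, and one-hot encoding via `pd.get_dummies` for one without. The data, column names, and the `size_order` mapping are hypothetical.

```python
import pandas as pd

# Hypothetical dataset with two categorical columns and one numeric column.
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],
    "color": ["red", "blue", "red", "green"],
    "price": [10.0, 30.0, 20.0, 12.0],
})

# Ordinal encoding: "size" has a natural order, so map it to integers.
size_order = {"small": 0, "medium": 1, "large": 2}  # assumed ordering
df["size"] = df["size"].map(size_order)

# One-hot encoding: "color" has no order, so expand it into indicator columns.
encoded = pd.get_dummies(df, columns=["color"], dtype=int)

print(encoded)
```

After this step every column is numeric, which is what most machine learning libraries expect as input.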

When you’re finished with this course, you will have the skills and knowledge you need to effectively clean and manipulate data using Pandas.

About the author

Pratheerth is a Data Scientist who entered the field after an eclectic mix of educational and work experiences. He has a Bachelor's in Engineering in Mechatronics from India and a Master's in Engineering Management from Australia, followed by a couple of years of work experience as a Production Engineer in the Middle East. When the AI bug bit him, he dropped everything to dedicate his life to the field. He currently works on mentoring, course creation, and freelancing as a Data Scientist.
