Manage Invalid, Duplicate, and Missing Data in Python

by Axel Sirota

Cleaning data is not glamorous, but it is key to any data application. This course will teach you the skills and knowledge of data cleaning in Pandas needed to convert your datasets from raw and useless to clean and useful.

What you'll learn

Regardless of your line of work, data is everywhere. Today, we generate more data per second than ever before; however, this data is usually raw and dirty, and frequently unusable.

In this course, Manage Invalid, Duplicate, and Missing Data in Python, you’ll gain the ability to clean your data to make it usable for any application you may need.

First, you’ll explore how to handle missing values and how to fill NaN values in your columns.
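
As a rough illustration of what handling missing values can look like in Pandas, here is a minimal sketch. The DataFrame, its column names, and the choice of a median fill are assumptions made for this example, not material taken from the course.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "city": ["Lima", None, "Oslo", "Quito"],
    "temp_c": [18.5, np.nan, 4.0, np.nan],
})

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Fill numeric gaps with the column median and text gaps with a placeholder.
# (The median is an assumed strategy; a mean or constant may suit other data.)
filled = df.fillna({"temp_c": df["temp_c"].median(), "city": "unknown"})

# Alternatively, drop every row that contains at least one missing value.
dropped = df.dropna()

print(missing_per_column)
print(filled)
print(dropped)
```

Whether to fill or drop depends on how much data is missing and on what the downstream application can tolerate.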

Next, you’ll discover how to deal with duplicate rows on a subset of columns.
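
Duplicate handling on a subset of columns might be sketched as follows. The `orders` table and its columns are hypothetical, and keeping the last occurrence is just one possible policy.

```python
import pandas as pd

# Hypothetical order log; two rows share the same customer and product.
orders = pd.DataFrame({
    "customer": ["ana", "ana", "bo", "bo"],
    "product": ["pen", "pen", "ink", "pad"],
    "ts": [1, 2, 3, 4],
})

# Flag rows that repeat an earlier (customer, product) pair,
# ignoring the other columns.
dup_mask = orders.duplicated(subset=["customer", "product"], keep="first")

# Keep only the most recent row for each (customer, product) pair.
deduped = orders.drop_duplicates(subset=["customer", "product"], keep="last")

print(dup_mask)
print(deduped)
```

Passing `subset` matters here: without it, the two "ana" rows differ in `ts` and would not count as duplicates.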

Finally, you’ll learn how to cope with invalid values and how to fix or remove them.
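
One possible way to fix or remove invalid values is sketched below. The `people` table, the 0 to 120 age range, and the median replacement are all assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical records where "age" arrived as text and contains bad entries.
people = pd.DataFrame({
    "name": ["Ana", "Bo", "Cy", "Di"],
    "age": ["34", "-5", "abc", "51"],
})

# Convert to numbers; entries that cannot be parsed become NaN.
people["age"] = pd.to_numeric(people["age"], errors="coerce")

# Treat ages outside an assumed plausible range as invalid.
# NaN values fail the range check, so they are marked invalid as well.
valid = people["age"].between(0, 120)

# Option 1: fix invalid ages by replacing them with the median of the valid ones.
fixed = people.assign(
    age=people["age"].where(valid, people.loc[valid, "age"].median())
)

# Option 2: remove the rows with invalid ages.
removed = people[valid]

print(fixed)
print(removed)
```

Fixing keeps every row at the cost of introducing estimated values, while removing keeps only observed values at the cost of losing rows.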

When you’re finished with this course, you’ll have the skills and knowledge of data cleaning in Pandas needed to convert your datasets from raw and useless to clean and useful.

About the author

Axel Sirota is a Microsoft Certified Trainer with a deep interest in Deep Learning and Machine Learning Operations. He has a Master's degree in Mathematics and, after doing research in Probability, Statistics, and Machine Learning optimisation, now works as an AI and Cloud Consultant and as an Author and Instructor at Pluralsight, Develop Intelligence, and O'Reilly Media.
