Course info
Aug 9, 2018
1h 38s

At the core of applied machine learning is a thorough knowledge of data wrangling. In this course, Data Wrangling with Pandas for Machine Learning Engineers, you will learn how to massage data into a modellable state. First, you will discover what data wrangling is and its importance to the machine learning process. Next, you will explore the Pandas DataFrame and see how data is manipulated within the DataFrame. Finally, you will learn how to build an accurate model with the cleansed dataset. When you are finished with this course, you will have a foundational knowledge of data wrangling that will help you as you move forward to becoming a machine learning engineer.

About the author

Mike has Bachelor of Science degrees in Business and Psychology. He's passionate about machine learning and data engineering.

Section Introduction Transcripts

Course Overview
Hello. My name is Mike West, and welcome to my course Data Wrangling with Pandas for Machine Learning Engineers. While artificial neural networks are getting all the attention, one of the most overlooked aspects of machine learning is the data. Regardless of the algorithm type, almost all machine learning models need well-formatted, structured data to perform optimally. It's the job of the machine learning engineer to wrangle the data into a modellable state. Data wrangling is one of the most difficult and time-consuming parts of machine learning. In the real world, data is dirty and machine learning models are temperamental. These models only want highly structured, well-cleansed data. In this course, we'll provide you with the foundation you need to wrangle those unruly datasets. The course will introduce you to applied data wrangling. You'll see how developers take real-world datasets and wrangle them into the highly structured numerical entities machine learning models need. The core library used by machine learning engineers to wrangle their data in Python is called pandas. You'll learn how to manipulate tabular data in an array. The array is the core data object in machine learning. Once the data has been properly wrangled, you'll build a highly accurate model that will predict a person's survivability if they were aboard the Titanic at the time of the sinking. Python has become the gold standard in applied machine learning, and a library called pandas is the preferred tool developers use to massage their data into a well-cleansed state. By the end of the course, you'll be familiar with the basics of data wrangling and the process machine learning engineers use to create well-cleansed, model-ready datasets. I hope you will join me on this journey to learn more about data wrangling with Python at Pluralsight.
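To make the idea of wrangling tabular data into a numerical array concrete, here is a minimal sketch in pandas. It is not taken from the course. The rows are invented for illustration, and the column names (`age`, `sex`, `embarked`, `survived`) and the `wrangle` helper are assumptions, loosely modeled on the kind of passenger data the overview mentions. The cleaning choices (median and most-frequent-value imputation, simple integer and one-hot encoding) are one common option, not necessarily the approach the course takes.

```python
import pandas as pd

# Hypothetical, hand-written rows for illustration only; not real passenger records.
raw = pd.DataFrame(
    {
        "age": [22.0, None, 35.0, 58.0],
        "sex": ["male", "female", "female", "male"],
        "embarked": ["S", "C", None, "S"],
        "survived": [0, 1, 1, 0],
    }
)


def wrangle(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with no missing values and only numeric columns."""
    out = df.copy()
    # Fill missing ages with the median of the observed ages.
    out["age"] = out["age"].fillna(out["age"].median())
    # Fill missing ports with the most frequent observed port.
    out["embarked"] = out["embarked"].fillna(out["embarked"].mode()[0])
    # Encode the two-valued text column as 0/1.
    out["sex"] = out["sex"].map({"male": 0, "female": 1})
    # One-hot encode the multi-valued text column into integer indicator columns.
    out = pd.get_dummies(out, columns=["embarked"], dtype=int)
    return out


clean = wrangle(raw)

# Split into a numeric feature array and a label array, the shape most models expect.
features = clean.drop(columns=["survived"]).to_numpy(dtype=float)
labels = clean["survived"].to_numpy()
```

After this step, `features` is a plain two-dimensional numeric array with one row per passenger and one column per feature, and `labels` holds the target values a model would be trained to predict.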