Pandas Playbook: Manipulating Data

Pandas is one of the most popular software packages for data analysis. This course focuses on the core functionality of Pandas for data wrangling and teaches you how to tackle the everyday tasks of a data analyst or data scientist.
Course info
Rating
(27)
Level
Intermediate
Updated
May 24, 2018
Duration
2h 16m
Description

Pandas is not just one of the most popular software packages for data analysis; it is also, without a doubt, the most convenient and fun way to work with your data. In this course, Pandas Playbook: Manipulating Data, you will cover the core functionality of the two main Pandas classes: the DataFrame and the Series. First, you will take a look at a new dataset and try to get a feeling for it: How many rows and columns are there? What datatypes does it consist of? You will do some basic statistical exploration as well. Then, you'll focus on getting information out of your dataset, which comes down to asking the right questions and drilling down into the data. Finally, you will learn how to clean and transform your data. Here, you will see how to run Python functions against your data, including functions you write yourself; how to use a very powerful feature called groupby(); how to change the structure of your columns and rows; and how to combine multiple DataFrames into one. After watching this course, you will be ready for just about any data wrangling job that you might come across.
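The steps named in the description can be sketched in a few lines. This is only an illustration, not material from the course: the sample data, the column names, and the helper value_range are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
sales = pd.DataFrame({
    "city": ["Utrecht", "Utrecht", "Delft", "Delft"],
    "units": [3, 5, 2, 4],
    "price": [10.0, 12.0, 8.0, 9.0],
})

# Getting a feeling for the dataset: size, datatypes, basic statistics.
n_rows, n_cols = sales.shape
column_types = sales.dtypes
summary = sales.describe()


def value_range(values):
    """Hypothetical user-written aggregation: spread between max and min."""
    return values.max() - values.min()


# groupby() with a built-in aggregation and a function we wrote ourselves.
units_by_city = sales.groupby("city")["units"].agg(["sum", value_range])

# Combining multiple DataFrames into one with a left join on a shared column.
regions = pd.DataFrame({
    "city": ["Utrecht", "Delft"],
    "province": ["Utrecht", "Zuid-Holland"],
})
combined = sales.merge(regions, on="city", how="left")

print(n_rows, n_cols)
print(units_by_city)
print(combined)
```

Passing a list to agg() yields one result column per aggregation, named after the string or the function, so the custom function shows up as a value_range column next to sum.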

About the author

After years of working in software development, Reindert-Jan Ekker decided to pursue another passion of his: education. He currently works as a college professor of Computer Science in the Netherlands, teaching subjects such as web development, algorithms and data structures, and Scrum.

More from the author
Shell Scripting with Bash
Intermediate
4h 33m
Sep 16, 2019
Python Best Practices for Code Quality
Intermediate
1h 11m
May 17, 2019
Section Introduction Transcripts

Course Overview
Hi everyone, my name is Reindert-Jan Ekker, and welcome to this playbook about manipulating data with Pandas. I'm a senior developer and freelance educator, and in this course I'll teach you about Pandas, the most popular Python framework for doing data science and analysis. The role of Pandas in data analysis has been growing rapidly over the past few years, and you simply cannot do without Pandas anymore. This course goes over the core tasks that you will need to perform when working with any real-world dataset, and we'll do so in a very hands-on way. Some of the major topics that we will cover include exploring a new dataset; selecting, sorting, and filtering your data so that you can drill down into your dataset to answer specific questions; cleaning a dataset, which means doing things like fixing bad or missing data points and removing outliers; and transforming your data, either by doing calculations on it or by changing the structure of your dataset. By the end of this course, you'll have a good understanding of the core functionality of a Pandas DataFrame, and you'll be able to handle most everyday tasks. Before beginning the course, you should be familiar with the very basics of Python and data science. I hope you'll join me on this journey to learn how to manipulate your data with Pandas Playbook: Manipulating Data, at Pluralsight.
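As a rough illustration of the topics listed in the overview, the sketch below filters and sorts a small dataset, fills in a missing value, removes an outlier, and adds a calculated column. The data and column names are hypothetical, and both the median fill and the 1.5 x IQR outlier rule are assumptions chosen for the example, not necessarily the techniques taught in the course.

```python
import pandas as pd

# Hypothetical sensor readings with one missing value and one obvious outlier.
readings = pd.DataFrame({
    "sensor": ["a", "b", "c", "d", "e", "f"],
    "temp_c": [20.5, None, 21.0, 19.5, 95.0, 20.0],
})

# Selecting, filtering, and sorting: readings above 20 degrees, warmest first.
warm = readings[readings["temp_c"] > 20.0].sort_values("temp_c", ascending=False)

# Cleaning: replace the missing data point with the column median (an assumption).
median = readings["temp_c"].median()
cleaned = readings.fillna({"temp_c": median})

# Removing outliers with the common 1.5 * IQR rule (also an assumption).
q1 = cleaned["temp_c"].quantile(0.25)
q3 = cleaned["temp_c"].quantile(0.75)
iqr = q3 - q1
in_range = cleaned["temp_c"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
no_outliers = cleaned[in_range].copy()

# Transforming: add a column calculated from an existing one.
no_outliers["temp_f"] = no_outliers["temp_c"] * 9 / 5 + 32

print(warm)
print(no_outliers)
```

Taking a copy() after filtering keeps the new temp_f column from being written to a view of the original frame, which avoids the SettingWithCopyWarning.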