Doing Data Science with Python

This course shows you how to work on an end-to-end data science project, including processing data, building and evaluating a machine learning model, and exposing the model as an API, in a standardized approach using various Python libraries.
Course info
Rating
(150)
Level
Beginner
Updated
Dec 28, 2017
Duration
6h 25m
Table of contents
Course Overview
Course Introduction
Setting up Working Environment
Extracting Data
Exploring and Processing Data - Part 1
Exploring and Processing Data - Part 2
Exploring and Processing Data - Part 3
Building and Evaluating Predictive Models – Part 1
Building and Evaluating Predictive Models – Part 2
Description

Do you want to become a data scientist? If so, this course will equip you with concepts and tools that bring you up to speed, and you can apply the skills you acquire to any data science project in a standardized way. This course, Doing Data Science with Python, takes a pragmatic approach to the end-to-end data science project cycle, from extracting data from different types of sources to exposing your machine learning model as API endpoints that can be consumed in a real-world data solution. It will help you not only to understand various data science concepts, but also to implement them in an industry-standard way using Python and related libraries. First, you will be introduced to the stages of a typical data science project cycle and a standardized project template that you can use on any data science project. Then, you will learn to use standard libraries in the Python ecosystem, such as Pandas, NumPy, Matplotlib, Scikit-Learn, Pickle, and Flask, to tackle different stages of a data science project: extracting data, cleaning and processing data, and building and evaluating a machine learning model. Finally, you'll dive into exposing the machine learning model as an API. You will also work through a case study that spans the whole course to learn the end-to-end execution of a data science project. By the end of this course, you will have a solid foundation for handling any data science project and the knowledge to apply various Python libraries to create your own data science solutions.

About the author

Abhishek Kumar is a data science consultant, author, and speaker. He holds a Master's degree from the University of California, Berkeley. His focus area is machine learning and deep learning at scale.

More from the author
R Programming Fundamentals
Beginner
7h 0m
Oct 18, 2014
Section Introduction Transcripts

Course Overview
Hi everyone. My name is Abhishek Kumar with Pluralsight, and welcome to my course on Doing Data Science with Python. Data science is one of the hottest fields these days, and it is no wonder that data scientist has been termed the sexiest job of the century, because with the help of data science you can uncover meaningful insights and generate data-driven evidence that can benefit organizations in a significant way and give them a competitive edge. So if you also want to get a jump-start in this fascinating field of data science, or if you are already in this field and want to learn a standardized way of tackling the end-to-end data science project cycle using Python, then this course is for you. In this course, we will dive into the various phases of a data science project, such as data extraction, data processing and visualization, building, evaluating, and fine-tuning predictive models, and finally, exposing your predictive models as APIs for real-time integration. The only prerequisite for this course is that you should be familiar with the basics of programming with Python. This course will not only help you build the foundations of data science, but also help you learn to implement the concepts through lots and lots of demos. By the end of this course, you'll have the knowledge and skills necessary to kick-start your data science journey in the Python world. So please join me in this very exciting course on Doing Data Science with Python, at Pluralsight.

Setting up Working Environment
Hi, this is Abhishek Kumar, and welcome to the second module of the course on Doing Data Science with Python. To tackle any data science project, you need a working environment, and Python provides a very rich set of tools to create such an environment very quickly. So in this module, you will learn to set up the working environment that we will use for this course, and you can use a similar working environment for your future data science projects as well.

Extracting Data
Hi, this is Abhishek Kumar, and welcome to the third module of the course on Doing Data Science with Python. Well, the journey of any data science project starts with gathering or extracting data, so this module will focus on the data extraction phase. However, the data can come from different types of sources. It can be available as files in different formats that you can download manually or programmatically. Sometimes, data may have to be scraped from websites or web pages, or it can be available through API calls. Other data may already be available directly in your company's database that you can hook into. Sometimes you may have to collect data manually using forms or surveys. And if the situation demands, you may have to extract data directly from some device using available ports. So as you can see, depending upon your problem, you may have to acquire data from different types of sources. Even though the list of data source types can be huge, in this module we will look at some of the most common data acquisition tasks that you will typically face in your data science journey.
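As a rough illustration of three of the source types mentioned above (a file, an API response, and a database), here is a minimal sketch that uses only the Python standard library. The sample data, the `passengers` table, and the helper names `rows_from_csv`, `rows_from_json`, and `rows_from_database` are assumptions made up for this example; they are not taken from the course, which works with libraries such as Pandas.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for a downloaded file and an API
# response body. None of it comes from the course material.
CSV_TEXT = "name,age\nAda,36\nGrace,45\n"
API_BODY = '{"passengers": [{"name": "Ada", "age": 36}]}'


def rows_from_csv(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_json(body, key):
    """Decode a JSON payload and return the list stored under `key`."""
    return json.loads(body)[key]


def rows_from_database(conn, query):
    """Run a SQL query and return each result row as a dict."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(query)]


# An in-memory SQLite database plays the role of an existing company database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passengers (name TEXT, age INTEGER)")
conn.execute("INSERT INTO passengers VALUES ('Ada', 36)")

csv_rows = rows_from_csv(CSV_TEXT)
api_rows = rows_from_json(API_BODY, "passengers")
db_rows = rows_from_database(conn, "SELECT name, age FROM passengers")
```

Note that values read from CSV arrive as strings, while the JSON and database rows keep their numeric types, so a type-conversion step is usually needed before the sources can be combined.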

Exploring and Processing Data - Part 2
Hi, this is Abhishek Kumar, and welcome to the fifth module of the course on Doing Data Science with Python. This module is the second part of the Exploring and Processing Data modules. Just a quick recap: from the data science project cycle view, we have already covered the data extraction phase and have extracted our Titanic disaster dataset. And in the previous module, which was the first part of the Exploring and Processing Data modules, we discussed the organize step and got a basic idea of our Titanic dataset, such as its basic structure and summary statistics. Well, we are still at the organize step in the data science project cycle. And if we go deeper into the organize step, we are still in the exploratory data analysis phase, and the focus of this module will remain on the EDA phase only. We will start from where we left off in the previous module and continue our data exploration journey. And along the way, we will learn some very interesting EDA techniques. Also, just to reiterate, we'll be covering data munging, feature engineering, and advanced visualization in the next module.
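To make the idea of summary statistics concrete, here is a small sketch that computes count, missing values, mean, median, minimum, and maximum for one column. It uses only the standard library `statistics` module so that it runs anywhere; the records are made-up values, not rows from the Titanic dataset, and the `summarize` helper is a hypothetical name chosen for this example.

```python
import statistics

# Made-up records for illustration only (not rows from the Titanic dataset).
records = [
    {"age": 22.0, "fare": 7.25},
    {"age": 38.0, "fare": 71.0},
    {"age": None, "fare": 8.0},
    {"age": 35.0, "fare": 53.0},
]


def summarize(rows, column):
    """Return basic summary statistics for one numeric column.

    Missing values (None) are counted separately and excluded from
    the numeric statistics.
    """
    values = [row[column] for row in rows if row[column] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


age_summary = summarize(records, "age")
fare_summary = summarize(records, "fare")
```

Reporting the missing count next to the other statistics is a deliberate choice here: it shows early in the exploration which columns will need attention during data munging.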

Building and Evaluating Predictive Models – Part 2
Hi, this is Abhishek Kumar, and welcome to the eighth and final module of the course on Doing Data Science with Python. This is also the second part in the two-part series of modules on building and evaluating predictive models. In the first part, we built our machine learning foundation and learned to build predictive models. In this module, you will learn some more advanced topics related to predictive modeling, and we'll fine-tune the predictive model that we built in the last module. Just to reiterate, if you look at the data science project cycle, we are in the model and present phases, and we will wrap up our journey in this module. If you drill down into the modeling phase, in this module we will proceed from the point where we left our predictive modeling journey in the previous module, so we'll fine-tune the predictive model that we built there. We will also learn about model persistence to save your trained model for future use. On the presentation front, we will create the next version of our submission file for the Kaggle Titanic disaster challenge after fine-tuning our machine learning model. Towards the end, you will also learn to use Python to create an API through which you can expose your machine learning models to external systems. This will enable your machine learning model to be a part of any full-fledged data solution.
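The two ideas above, model persistence and exposing a model to external systems, can be sketched roughly as follows. This is only an illustration under stated assumptions: the "model" is a hypothetical threshold rule stored as a plain dict rather than a fitted estimator, and `handle_request` is a made-up function that mimics what an API endpoint does (JSON in, JSON out) without starting a web server.

```python
import json
import os
import pickle
import tempfile

# Hypothetical "trained model": a threshold rule stored as plain data.
# A real project would persist a fitted estimator instead.
model = {"feature": "fare", "threshold": 50.0}


def predict(model, record):
    """Return 1 when the record's feature value exceeds the threshold, else 0."""
    return int(record[model["feature"]] > model["threshold"])


def save_model(model, path):
    """Serialize the model to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously pickled model from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def handle_request(body, model):
    """Mimic an API endpoint: JSON request body in, JSON response body out."""
    record = json.loads(body)
    return json.dumps({"prediction": predict(model, record)})


# Save the model, then load it back as a separate process might do later.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.pkl")
    save_model(model, model_path)
    restored = load_model(model_path)

response = handle_request('{"fare": 71.0}', restored)
```

In a web framework, a function like `handle_request` would sit behind a route and the model would be loaded once at startup. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.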