Representing, Processing, and Preparing Data

This course covers data processing tools, including spreadsheets, Python, and relational databases, and deals with data quality issues and with visualizing data to generate insights.
Course info
Level
Beginner
Updated
Jun 19, 2019
Duration
2h 45m
Table of contents
Course Overview
Understanding Data Cleaning and Preparation Techniques
Preparing Data for Analysis Using Spreadsheets and Python
Collecting Data to Extract Insights
Loading and Processing Data Using Relational Databases
Representing Insights Obtained from Data
Description

Data science and data modeling are fast emerging as crucial capabilities that every enterprise and every technologist must possess these days. As the process of actually constructing models becomes democratized, the general view is shifting toward using the right data and using the data right. In this course, Representing, Processing, and Preparing Data, you will gain the ability to correctly represent information from your domain as numeric data, and get it into a form where the full capabilities of models can be leveraged. First, you will learn how outliers and missing data can be dealt with in a theoretically sound manner. Next, you will discover how to use spreadsheets, programming languages and relational databases to work with your data. You will see the different types of data that you may deal with in the real world and how you can collect and integrate data to a common destination to eliminate silos. Finally, you will round out the course by working with visualization tools that allow every member of an enterprise to work with data and extract meaningful insights. When you are finished with this course, you will have the skills and knowledge to use the right data sources, cope with data quality issues and choose the right technologies to extract insights from your enterprise data.

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

More from the author
Building Features from Image Data
Advanced
2h 10m
Aug 13, 2019
Designing a Machine Learning Model
Intermediate
3h 25m
Aug 13, 2019
Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and welcome to this course on representing, processing, and preparing data. A little about myself. I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. Data science and data modeling are fast emerging as crucial capabilities that every enterprise and every technologist must possess these days. As the process of actually constructing models becomes democratized, the _____ is shifting to using the right data and using the data right. In this course, you will gain the ability to correctly represent information from your domain as numeric data and get it into a form where the full capabilities of models can be leveraged. First, you'll learn how outliers and missing data can be dealt with in a perfectly sound manner. Next, you'll discover how to use spreadsheets, programming languages, and relational databases to work with your data. You'll see the different types of data that you may deal with in the real world, and how you can collect and integrate data to a common destination to eliminate silos. Finally, you'll round out the course by working with visualization tools that allow every member of an enterprise to work with data and extract meaningful insights. When you're finished with this course, you will have the skills and knowledge to use the right data sources, cope with data quality issues, and choose the right technologies to extract insights from your enterprise data.