Data Management and Preparation Using R

Data management and data preparation are important yet widely overlooked parts of data analysis. This course covers the main data preparation tasks: importing data, selecting a proper class, cleaning, and filtering.
Course info
Level
Beginner
Updated
Sep 18, 2017
Duration
2h 0m
Description

Have you ever run into problems in data analysis just because the data was not clean, was in the wrong format, or was simply messy? Data preparation is an immensely important yet often overlooked part of data science. Data professionals spend most of their time not on analysis or visualization but on getting data as clean and well structured as possible. R is a widely used open source tool with an active user community, and that community has created high-quality add-on packages for data preparation. In this course, Data Management and Preparation Using R, you will learn not only about data preparation in base R but also about the add-on packages that make R so powerful. First, you'll learn about data importing, cleaning, and structuring (selecting the right class). Next, you'll explore data querying. Finally, you will learn about dplyr, tidyr, reshape2, and data.table. By the end of this course, you will be able to select the right tools and efficiently perform data import, formatting, cleaning, and querying.

About the author

Martin is a trained biostatistician, programmer, consultant and data science enthusiast. His main objective: Explaining data science in a straightforward way. You can find his latest work over at: r-tutorials.com

More from the author
Querying and Converting Data Types in R
Beginner
2h 6m
Aug 15, 2019
Mining Data from Time Series
Advanced
2h 59m
Jun 3, 2019
Section Introduction Transcripts

Course Overview
Hi guys, this is Martin Burger, and I welcome you to my course, Data Management and Preparation Using R. As a data scientist and biostatistician, I know first-hand how important clean and well-formatted data is. R is widely used for data analytics, and it offers great data management and data preparation tools. In this course, you will learn techniques to solve the most common problems of the data preparation step. This is basically the first step in the whole data analytics process, which means you cannot escape it. No matter your industry or analytics approach, you have to prepare your data first. In the course, you will see how to use different import tools to get standard as well as exotic file formats into R. You will learn which object classes are best suited to your data sets. You will use the tidyr add-on package to clean and format your data, and you will use the data.table package, as well as standard tools, to filter or query even large data sets. By the end of this course, you will be able to select suitable add-on packages and use the best functions for data preparation. I would categorize this course as a beginner-plus course. If you're familiar with basic R code, you will be able to fully benefit from this course. Alright guys, I hope you will enjoy this course. I'll see you inside.