
Querying and Converting Data Types in R Hands-on Practice

This lab teaches querying and converting data types in R, starting with the basics of dataset structures and moving through data manipulation techniques. It provides practical examples of data querying and filtering. The lab concludes with essential resources for further learning in data analysis.


Lab Info

Level: Intermediate
Last updated: Aug 22, 2025
Duration: 54m

Exploring and Managing Data with RStudio
RStudio Guide

To get started, click the 'workspace' folder in the bottom right pane of RStudio, then click the file titled "Step 1...". You may want to shrink the console pane so that you have more room to work. Complete each task for Step 1 in that R Markdown file. For each task, run its cells with the play button at the top right of the cell before moving on to the next task. When you have completed all tasks in the step, come back and open the file for the next step, and repeat until you have completed every step of the lab.

Exploring and Managing Data with RStudio

To review the concepts covered in this step, please refer to the Understanding Dataset Structures and Formats module of the Querying and Converting Data Types in R course.

Understanding Dataset Structures and Formats is important because it lays the foundation for all data analysis tasks in R. This step focuses on practical skills like exploring datasets, using the RStudio interface effectively, and managing data types and packages, which are crucial for any aspiring R programmer.

Start by loading the dataset (mtcars) available in R. Use the summary() function to get an overview of the dataset. Explore the dataset by attaching it and using basic R commands to query and filter data. Convert a data frame to a tibble and a data table, observing the differences in output and performance. This exercise will help you become familiar with RStudio's interface and the basic data manipulation tasks in R.

Task 1.1: Loading and Summarizing the Dataset

Start by loading the mtcars dataset, which is built into R. Use the summary() function to get a quick overview of the dataset. This will help you understand the structure and the type of data it contains.

🔍 Hint

You can access the mtcars dataset directly since it's built into R. Use the summary() function by passing the dataset name as an argument.
🔑 Solution

# Load the mtcars dataset
mtcars
# Use the summary function to get an overview of the dataset
summary(mtcars)
Task 1.2: Exploring the Dataset

Attach the mtcars dataset. Find the cars with an mpg (miles per gallon) greater than 20.

🔍 Hint

Use the attach() function to make the mtcars dataset's columns directly accessible. Then, use the dataset column mpg directly in a conditional statement to filter the data.
🔑 Solution

# Attach the mtcars dataset
attach(mtcars)
# Find cars with mpg greater than 20
mtcars[mpg > 20, ]
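attach() leaves the columns on the search path for the rest of the session, where they can mask or be masked by other objects. As a hedged alternative, the same query can be written without attaching anything. The variable names `efficient_cars` and `efficient_cars_with` are illustrative and not part of the lab.

```r
# Reference the column through the data frame instead of attaching it
efficient_cars <- mtcars[mtcars$mpg > 20, ]

# with() evaluates the condition inside the data frame, so bare column names work
efficient_cars_with <- mtcars[with(mtcars, mpg > 20), ]

# Both approaches select the same rows
identical(efficient_cars, efficient_cars_with)

# If attach(mtcars) was called earlier, remove it from the search path when done:
# detach(mtcars)
```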
Task 1.3: Converting Data Frame to Tibble

Convert the mtcars data frame to a tibble named mtcars_tibble. Print mtcars_tibble and observe the differences in output. Tibbles are a modern take on data frames, but with some added conveniences.

🔍 Hint

Use the as_tibble() function from the tibble package to convert mtcars into a tibble. Make sure to load the tibble package using library(tibble) before converting.
🔑 Solution

# Load the tibble package
library(tibble)
# Convert mtcars to a tibble
mtcars_tibble <- as_tibble(mtcars)
# Observe the output
print(mtcars_tibble)
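One difference worth noticing in the printed output is that the car names are gone, because tibbles do not keep row names. The sketch below shows one way to preserve them using the `rownames` argument of as_tibble(). The column name "model" and the variable `mtcars_with_models` are arbitrary choices for illustration.

```r
library(tibble)

# as_tibble() drops row names by default, so the car models disappear
mtcars_tibble <- as_tibble(mtcars)
class(mtcars_tibble)

# Move the row names into a regular column instead of losing them
mtcars_with_models <- as_tibble(mtcars, rownames = "model")
names(mtcars_with_models)[1]
```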
Task 1.4: Converting Data Frame to Data Table

Now, convert the mtcars data frame to a data table named mtcars_data_table and observe how the output differs from a regular data frame and a tibble.

🔍 Hint

Use the as.data.table() function from the data.table package to convert mtcars into a data table. Remember to load the data.table package using library(data.table) before converting.
🔑 Solution

# Load the data.table package
library(data.table)
# Convert mtcars to a data table
mtcars_data_table <- as.data.table(mtcars)
# Observe the output
print(mtcars_data_table)
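A data table also discards row names on conversion. As a small sketch, the `keep.rownames` argument of as.data.table() can store them in a column. The column name "model" and the variable `mtcars_dt_models` are assumptions made for this example.

```r
library(data.table)

# A data.table is still a data frame, with extra query syntax layered on top
mtcars_data_table <- as.data.table(mtcars)
class(mtcars_data_table)

# keep.rownames stores the former row names in a column with the given name
mtcars_dt_models <- as.data.table(mtcars, keep.rownames = "model")
head(mtcars_dt_models[, .(model, mpg)], 3)
```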
Data Type Conversion and List Management

To review the concepts covered in this step, please refer to the Selecting and Converting Data Types module of the Querying and Converting Data Types in R course.

Selecting and converting data types is essential for data preprocessing and analysis in R. This step focuses on converting between numeric, integer, character, and factor variables, as well as managing lists, which are common tasks in data science projects.

This exercise will enhance your understanding of R's data types and how to manipulate them.

Task 2.1: Create a Mixed Data Vector

Create a vector named mixed_data containing a mix of numeric, integer, character, and logical values. Include the numeric value 42.2, the integer 3, the character string R is fun, and the boolean TRUE.

🔍 Hint

Use the c() function to combine values of different types. Remember to use quotes for character values and TRUE or FALSE for logical values.
🔑 Solution

mixed_data <- c(42.2, 3L, 'R is fun', TRUE)
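It helps to inspect what c() actually produced here. An atomic vector in R holds a single type, so the values are not stored as a true mix. The short check below illustrates the coercion.

```r
mixed_data <- c(42.2, 3L, 'R is fun', TRUE)

# c() coerces every element to the most flexible type present, here character
class(mixed_data)

# Every value is now a string, including the number and the logical
mixed_data
```

This is why the later conversion tasks operate on strings such as "42.2" and "TRUE" rather than on the original numeric and logical values.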
Task 2.2: Convert Mixed Data to Numeric

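Convert the mixed_data vector to numeric type and assign it to a new variable numeric_data. Print the new variable. Because c() already coerced every element of mixed_data to character, any string that does not parse as a number, including "TRUE", becomes NA and R issues a warning.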

🔍 Hint

Use the as.numeric() function and pass mixed_data as the argument.
🔑 Solution

numeric_data <- as.numeric(mixed_data)
print(numeric_data)
Task 2.3: Convert Mixed Data to Integer

Convert the mixed_data vector to integer type and assign it to a new variable integer_data. Print the new variable. Non-numeric values will be coerced to NA, while decimal values are truncated toward zero, so the fractional part is lost.

🔍 Hint

Use the as.integer() function and pass mixed_data as the argument.
🔑 Solution

integer_data <- as.integer(mixed_data)
print(integer_data)
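The two conversions can be compared side by side. This sketch wraps the calls in suppressWarnings() only to keep the output quiet, which is a choice made for this example rather than something the lab requires.

```r
mixed_data <- c(42.2, 3L, 'R is fun', TRUE)

# Strings that do not parse as numbers become NA
numeric_data <- suppressWarnings(as.numeric(mixed_data))
numeric_data

# as.integer() drops the fractional part instead of rounding
integer_data <- suppressWarnings(as.integer(mixed_data))
integer_data

# Truncation and rounding give different answers for the same number
as.integer(2.9)
round(2.9)
```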
Task 2.4: Convert Mixed Data to Character

Convert the mixed_data vector to character type and assign it to a new variable character_data. Print the new variable.

🔍 Hint

Use the as.character() function and pass mixed_data as the argument.
🔑 Solution

character_data <- as.character(mixed_data)
print(character_data)
Task 2.5: Factor to Character Conversion

Create a factor variable named factor_var with values 'high', 'low', 'high', 'medium', and levels 'low', 'medium', and 'high'. Print factor_var. Then, convert this factor to a character variable named char_var and print it.

🔍 Hint

To create a factor variable, use the factor() function. The first argument should be the values, and the second argument should specify the levels. Use the as.character() function and pass factor_var as the argument to convert it to a character vector.
🔑 Solution

factor_var <- factor(c('high', 'low', 'high', 'medium'), levels=c('low', 'medium', 'high'))
print(factor_var)
char_var <- as.character(factor_var)
print(char_var)
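Converting to character first matters because a factor stores integer level codes internally. The sketch below shows what happens when a factor is converted straight to a number. The `numeric_factor` vector is hypothetical data invented for this illustration.

```r
factor_var <- factor(c('high', 'low', 'high', 'medium'),
                     levels = c('low', 'medium', 'high'))

# as.integer() returns the position of each value in the levels, not the label
as.integer(factor_var)

# For a factor whose labels look like numbers, go through character first
numeric_factor <- factor(c('10', '20', '10'))  # hypothetical example data
as.integer(numeric_factor)                     # level codes
as.numeric(as.character(numeric_factor))       # the original values
```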
Task 2.6: Logical to Numeric Conversion

Create a logical vector named logical_vec with values TRUE, FALSE, and TRUE. Convert this logical vector to a numeric vector named numeric_vec and print it.

🔍 Hint

Use the as.numeric() function and pass logical_vec as the argument.
🔑 Solution

logical_vec <- c(TRUE, FALSE, TRUE)
numeric_vec <- as.numeric(logical_vec)
print(numeric_vec)
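Because TRUE converts to 1 and FALSE to 0, arithmetic functions can summarize logical vectors directly. The following is a brief sketch of that idea, reusing the built-in mtcars data.

```r
logical_vec <- c(TRUE, FALSE, TRUE)

# Summing a logical vector counts the TRUE values
sum(logical_vec)

# The mean gives the proportion of TRUE values
mean(logical_vec)

# The same trick counts rows that satisfy a condition
sum(mtcars$mpg > 20)
```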
Task 2.7: Convert Character to POSIXct and POSIXlt

Given the provided character vector char_dates with dates in the format 'YYYY-MM-DD HH:MM:SS', convert it to POSIXct and POSIXlt formats, assigning the results to posixct_dates and posixlt_dates respectively.

🔍 Hint

Use the as.POSIXct() function for converting to POSIXct and as.POSIXlt() for POSIXlt. The format string should be '%Y-%m-%d %H:%M:%S'.
🔑 Solution

char_dates <- c('2023-04-01 12:00:00', '2023-04-02 15:30:00')
posixct_dates <- as.POSIXct(char_dates, format = '%Y-%m-%d %H:%M:%S')
posixlt_dates <- as.POSIXlt(char_dates, format = '%Y-%m-%d %H:%M:%S')
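The two classes print the same way but store the data differently. The sketch below highlights the difference. It sets tz = 'UTC', which is an assumption added here so the results do not depend on the local time zone; the lab's solution uses the session default.

```r
char_dates <- c('2023-04-01 12:00:00', '2023-04-02 15:30:00')

# tz = 'UTC' is an assumption for reproducibility, not part of the lab solution
posixct_dates <- as.POSIXct(char_dates, format = '%Y-%m-%d %H:%M:%S', tz = 'UTC')
posixlt_dates <- as.POSIXlt(char_dates, format = '%Y-%m-%d %H:%M:%S', tz = 'UTC')

# POSIXct stores each date-time as one number: seconds since 1970-01-01 UTC
as.numeric(posixct_dates)

# POSIXlt stores named components that can be read individually
posixlt_dates$hour
posixlt_dates$min

# Date-time arithmetic works directly on POSIXct values
difftime(posixct_dates[2], posixct_dates[1], units = 'hours')
```

POSIXct is usually the more compact choice for columns in a data frame, while POSIXlt is convenient when individual components such as the hour are needed.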
Task 2.8: Create and Explore a List

Create a list named mixed_list containing elements of different data types. Each element should be named. Include the numeric value 42.2, the integer 3, the character R is fun, the boolean TRUE, and the vector posixct_dates from the prior task. Then, explore the structure of this list using the str() function.

🔍 Hint

Use the list() function to combine elements of different types. To explore the list, use the str() function with mixed_list as the argument.
🔑 Solution

mixed_list <- list(numeric_value=42.2, integer_value=3L, character_value='R is fun', boolean_value=TRUE, vector_value=posixct_dates)
str(mixed_list)
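Once the list exists, its elements can be pulled out in a few ways. This sketch uses a shortened version of the list, leaving out the date vector for brevity, to show the difference between the extraction operators.

```r
# Shortened version of the lab's list (the date vector is omitted here)
mixed_list <- list(numeric_value = 42.2, integer_value = 3L,
                   character_value = 'R is fun', boolean_value = TRUE)

# $ and [[ ]] return the element itself
mixed_list$numeric_value
mixed_list[['character_value']]

# Single brackets return a smaller list, not the element
class(mixed_list['integer_value'])

# Unlike a vector built with c(), each element keeps its own type
sapply(mixed_list, class)
```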
Advanced Data Querying and Filtering

To review the concepts covered in this step, please refer to the Querying and Filtering Data module of the Querying and Converting Data Types in R course.

Querying and filtering are fundamental skills for extracting insights from data. This step covers advanced querying and filtering techniques using data frames, data tables, and tibbles, which are essential for any data analysis project in R.

Practice querying a data frame using box brackets and logical tests. Explore the use of the subset() function to filter data frames based on specific criteria. Move on to querying and filtering a data table, using advanced techniques like the %in% operator. Experiment with the dplyr package to perform queries on a tibble, using functions like filter(), arrange(), select(), and mutate(). This step will help you master the art of data querying and filtering in R, using a variety of data structures.

Task 3.1: Querying a Data Frame with Logical Tests

Load the mtcars dataset, one of the default datasets that comes with R. Use logical tests within box brackets [] to query the data frame. Extract rows where the mpg column is greater than 25.

🔍 Hint
Use the `data()` function to load `mtcars`. Use the syntax data_frame_name[condition, ] to perform the query. The condition should be a logical test applied to one of the columns, like data_frame$mpg > 25.
🔑 Solution

data(mtcars)
mtcars[mtcars$mpg > 25, ]
Task 3.2: Filtering Data Frames with the subset() Function

Use the subset() function to filter rows from a data frame based on a specific condition. Keep only the rows where hp (horsepower) is less than 100.

🔍 Hint
The subset() function syntax is subset(x, subset), where x is the data frame and subset is the condition. For example, subset(df, hp < 100).
🔑 Solution

subset(mtcars, hp < 100)
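For comparison, the sketch below shows the bracket equivalent of this subset() call along with the optional `select` argument, which limits the returned columns. The variable names `low_hp` and `low_hp_brackets` are illustrative only.

```r
# subset() evaluates the condition inside the data frame, so no mtcars$ prefix is needed
low_hp <- subset(mtcars, hp < 100)

# The equivalent query with box brackets
low_hp_brackets <- mtcars[mtcars$hp < 100, ]
all.equal(low_hp, low_hp_brackets)

# Conditions can be combined, and select keeps only the chosen columns
subset(mtcars, hp < 100 & cyl == 4, select = c(mpg, hp))
```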
Task 3.3: Querying a Data Table with Advanced Operators

Load the data.table package. Convert mtcars to a data.table named mtcars_dt. Query the data table using the %in% operator and order() function. Extract rows where the cyl (number of cylinders) is either 4 or 6, and order them by wt (weight) in descending order.

🔍 Hint
To use the %in% operator, apply it within the i argument of the data table syntax DT[i, j, by]. For ordering, call the order() function in the i argument of a second, chained bracket, and prefix the column with a minus sign (`-column_name`) for descending order.
🔑 Solution

library('data.table')
mtcars_dt <- as.data.table(mtcars)
mtcars_dt[cyl %in% c(4, 6), .SD, .SDcols = c('cyl', 'wt')][order(-wt)]
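The solution above keeps only the cyl and wt columns. As a hedged variation, the filter and the ordering can be separated from column selection so that all columns remain available. The name `four_or_six` is an illustrative choice.

```r
library(data.table)
mtcars_dt <- as.data.table(mtcars)

# Filter rows in i, then chain a second bracket whose i argument reorders them
four_or_six <- mtcars_dt[cyl %in% c(4, 6)][order(-wt)]

# Columns can be picked afterwards in j with .( ), an alternative to .SD and .SDcols
four_or_six[, .(cyl, wt)]
```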
Task 3.4: Performing Queries on a Tibble with dplyr

Use the dplyr package to perform complex queries on a tibble. Convert the mtcars data frame to a tibble called mtcars_tbl. First, filter rows where mpg is greater than 25. Second, arrange them by carb (number of carburetors) in ascending order. Third, select only the carb and mpg columns. Finally, add a new column mpg_plus_one that is mpg + 1.

🔍 Hint
Chain the functions using the %>% operator. Start with `filter()` to apply the mpg condition, then use `arrange()` to sort by carb, followed by `select()` to pick columns, and finally `mutate()` to add the new column.
🔑 Solution

library('dplyr')
mtcars_tbl <- as_tibble(mtcars)
mtcars_tbl %>%
  filter(mpg > 25) %>%
  arrange(carb) %>%
  select(carb, mpg) %>%
  mutate(mpg_plus_one = mpg + 1)
Leveraging the Tidyverse for Data Preprocessing

To review the concepts covered in this step, please refer to the Course Summary and Further Resources module of the Querying and Converting Data Types in R course.

Course Summary and Further Resources is important because it consolidates the learning and introduces powerful tools for data preprocessing. This step emphasizes the use of the Tidyverse suite of packages for efficient data manipulation, which is a critical skill in data science.

Load the tidyverse package, which comes pre-installed in this environment. Practice using readr to import data and tidyr for data cleaning tasks. Explore the use of purrr for applying functions across elements in a list or vector. Use dplyr to perform data manipulation tasks such as selecting, renaming, summarizing, and mutating data. This exercise will familiarize you with the Tidyverse ecosystem, enhancing your data preprocessing capabilities in R.

Task 4.1: Loading the tidyverse Package

Begin by loading the tidyverse package to access its suite of tools for data manipulation. This is the first step in utilizing the powerful features of the tidyverse for data preprocessing.

🔍 Hint

Use the library() function and specify the name of the package you want to load, which is tidyverse.
🔑 Solution

library('tidyverse')
Task 4.2: Importing Data with readr

Use the readr package, part of the Tidyverse, to import the CSV file named sample_data.csv into R. Store the imported data in a variable named data for further manipulation.

🔍 Hint

You don't need to load readr explicitly, as it is already loaded by tidyverse. Assign the result of read_csv() function to a variable. The function takes the file name as its argument, which in this case is sample_data.csv.
🔑 Solution

data <- read_csv('sample_data.csv')
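The contents of sample_data.csv are not shown in this guide, so the sketch below writes a small stand-in file to a temporary location to demonstrate read_csv(). The rows, the temporary path, and the name `sample_tbl` are all hypothetical; only the column names id, value, and numbers are taken from the later tasks.

```r
library(readr)

# Write a small hypothetical CSV to a temporary file
csv_path <- tempfile(fileext = '.csv')
writeLines(c('id,value,numbers',
             '1,2.5,10',
             '2,,20',
             '3,4.0,30'), csv_path)

# col_types declares the column types instead of letting read_csv() guess them
sample_tbl <- read_csv(csv_path,
                       col_types = cols(id = col_integer(),
                                        value = col_double(),
                                        numbers = col_double()))

# Empty fields are read as NA
sample_tbl
```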
Task 4.3: Cleaning Data with tidyr

With the tidyr package, part of the Tidyverse, practice cleaning the imported data. Specifically, use the drop_na() function to remove any rows with missing values from the data dataframe.

🔍 Hint

Use the drop_na() function on the data variable to remove rows with NA values. Assign the result back to a variable.
🔑 Solution

data <- drop_na(data)
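drop_na() can also be restricted to particular columns. The data frame below is a hypothetical stand-in for the lab's file, used only to show how the two forms differ.

```r
library(tidyr)

# Hypothetical stand-in for the imported data; the real file's rows are not shown
sample_df <- data.frame(id = 1:4,
                        value = c(2.5, NA, 4.0, 1.5),
                        numbers = c(10, 20, NA, 40))

# With no columns named, rows with NA in any column are removed
drop_na(sample_df)

# Naming a column removes only the rows that are missing in that column
drop_na(sample_df, value)
```

Restricting the check to the columns an analysis actually uses avoids discarding rows unnecessarily.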
Task 4.4: Applying Functions with purrr

Utilize the purrr package, part of the Tidyverse, to apply a function that doubles the values of the numbers column. Store the result in a new variable named doubled_numbers and print that variable.

🔍 Hint

Use the map_dbl() function from purrr. The first argument is the vector numbers, and the second argument is a formula that specifies the function to apply, in this case, doubling the values.
🔑 Solution

doubled_numbers <- map_dbl(data$numbers, ~ .x * 2)
print(doubled_numbers)
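For simple arithmetic, R's vectorized operators give the same result as map_dbl(). The sketch below compares the two on a hypothetical `numbers` vector standing in for data$numbers, then shows a case where mapping over a list is the more natural fit.

```r
library(purrr)

numbers <- c(1, 2, 3)  # hypothetical values standing in for data$numbers

# map_dbl() calls the function once per element and returns a double vector
doubled_with_map <- map_dbl(numbers, ~ .x * 2)

# Arithmetic is already vectorized, so this produces the same values
doubled_vectorized <- numbers * 2
identical(doubled_with_map, doubled_vectorized)

# Mapping is most useful when each element needs its own function call
map_dbl(list(a = 1:3, b = 4:6), mean)
```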
Task 4.5: Data Manipulation with dplyr

Using the dplyr package, part of the Tidyverse, perform a series of data manipulation tasks on the data dataframe. First, select only the columns id and value. Then, rename the column value to measurement. Finally, add a new column measurement_sq that contains the square of measurement. Print the new version of data.

🔍 Hint

Chain the operations using the %>% operator. Use select() to choose columns, rename() to change column names, and mutate() to add new columns. Replace placeholders with the correct column names and operations.
🔑 Solution

data <- data %>%
  select(id, value) %>%
  rename(measurement = value) %>%
  mutate(measurement_sq = measurement^2)
print(data)
