Understanding Statistical Models and Mathematical Models Hands-on Practice
In this lab, you will learn to apply mathematical and statistical models using R, starting with basic concepts and progressing to real-world applications. You will then model phenomena like population growth and financial risk, solve complex problems such as the 8-queens puzzle, and perform hypothesis testing using T-tests and Z-tests.

Exploring Data and Metadata in R
RStudio Guide
To get started, click on the 'workspace' folder in the bottom right pane of RStudio, then click on the file entitled "Step 1...". You may want to shrink the console pane so that you have more room to work. You'll complete each task for Step 1 in that R Markdown file. Run the cells for a task with the play button at the top right of each cell before moving on to the next task. Continue until you have completed all tasks in this step. When you are ready for the next step, come back and click on the file for that step, and repeat until you have completed all tasks in all steps of the lab.
To review the concepts covered in this step, please refer to the Understanding Statistical and Mathematical Models module of the Understanding Statistical Models and Mathematical Models course.
Understanding the structure and context of your data is crucial for effective data analysis and model selection. This step will help you practice interpreting data with the context provided by metadata, a foundational skill in both statistical and mathematical modeling.
Dive into the world of data analysis by exploring the HRIS.csv dataset available in the lab environment. Your goal is to understand the structure of the dataset and the context provided by its metadata. Use the str() function to examine the structure of the dataset, highlighting the types of data and any potential challenges they may present. Then, leverage the comment() function to add and query metadata for the dataset, providing context that could influence your analysis and model selection. This hands-on experience will solidify your understanding of how metadata can impact data interpretation and model applicability.
Task 1.1: Loading and Exploring the Dataset
Begin your journey into data analysis by loading the HRIS.csv dataset into R. Use the read.csv() function to load the dataset and assign it to a variable. Then, use the str() function to examine the structure of the dataset, focusing on the types of data it contains and any potential challenges these data types may present for analysis.
Hint
Use read.csv('HRIS.csv') to load the dataset into a variable named hris_data. Then, call str(hris_data) to display its structure.
Solution
# Load the HRIS.csv dataset
hris_data <- read.csv('HRIS.csv')
# Examine the structure of the dataset
str(hris_data)
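To see the kind of challenge str() can reveal, here is a minimal sketch on a made-up data frame. The toy values and the salary, hired, salary_num, and hired_date names are assumptions for illustration only; the real HRIS.csv may look quite different.

```r
# Toy data frame standing in for an HR extract (hypothetical values)
toy <- data.frame(
  gender = c("Male", "Female", "Female"),
  salary = c("52,000", "61,000", "58,000"),   # numbers stored as text
  hired  = c("2019-03-01", "2020-07-15", "2018-11-30"),
  stringsAsFactors = FALSE
)

# str() shows that salary and hired are character columns
str(toy)

# Convert them before doing arithmetic or date calculations
toy$salary_num <- as.numeric(gsub(",", "", toy$salary))
toy$hired_date <- as.Date(toy$hired)

# Check the structure again to confirm the new column types
str(toy)
```

A column that looks numeric on screen but is stored as text is a common reason a model or summary function fails, which is why checking str() first is worthwhile.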
Task 1.2: Adding Metadata to the Dataset
Now that you have a basic understanding of the dataset's structure, it's time to add context to it by adding metadata. Use the comment() function to add a brief description to the hris_data dataset. This description should provide context about what the dataset represents and any relevant information that could influence analysis or model selection.
Hint
Use comment(hris_data) <- 'your description here' to add a description to the dataset.
Solution
comment(hris_data) <- 'This dataset contains HR information including employee details, positions, and salaries.'
Task 1.3: Querying Metadata
With metadata added to your dataset, explore how you can query this information to better understand the context of your data. Use the comment() function to retrieve the description you added to the hris_data dataset.
Hint
To retrieve the description you added to the dataset, simply use comment(hris_data).
Solution
my_comment <- comment(hris_data)
print(my_comment)
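The round trip of setting and querying a comment can be tried without the lab dataset. The sketch below uses a made-up data frame (the toy name and its values are assumptions) and shows that the comment is stored as an attribute and does not appear when the data frame is printed.

```r
# Toy data frame; the column names and values are made up for illustration
toy <- data.frame(id = 1:3, salary = c(52000, 61000, 58000))

# Attach a free-text description as metadata
comment(toy) <- "Toy HR extract: one row per employee, salary in USD per year."

# Printing the data frame does not display the comment
print(toy)

# Retrieve the comment explicitly, either way
comment(toy)
attr(toy, "comment")
```

Because the comment is hidden during printing, it is easy to forget it exists, so querying it explicitly is a good habit when you receive a dataset from someone else.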
Modeling Population Growth with ODEs
To review the concepts covered in this step, please refer to the Case Studies on Statistical and Mathematical Models module of the Understanding Statistical Models and Mathematical Models course.
Ordinary Differential Equations (ODEs) are a powerful tool for modeling deterministic systems, such as population growth. This step will enhance your ability to apply mathematical models to real-world problems, focusing on the deterministic nature of such models.
In this task, you'll apply your knowledge of ODEs to model population growth. Using the deSolve package in R, set up and solve an ODE that models population growth based on Verhulst's Decreasing Growth Model. You'll need to define the initial population size, the rate of growth, and the carrying capacity of the environment. Visualize the solution using R's plotting functions to understand how the population evolves over time. This exercise will deepen your understanding of how mathematical models can be used to predict deterministic systems.
Task 2.1: Loading the deSolve Package
Before we can start modeling population growth with ODEs, we need to load the deSolve package, which provides functions to solve initial value problems for differential equations. This package is essential for our task.
Hint
Use the library function to load the deSolve package.
Solution
library('deSolve')
Task 2.2: Defining the Population Growth Model
Define a function pop_growth that represents Verhulst's Decreasing Growth Model. This function will take three parameters: time t, state y, and parameters parms. The parms argument is a list specifying the rate of growth r and the carrying capacity K. The function should return the rate of change of the population size, wrapped in a list as deSolve expects.
Hint
The rate of change of the population size can be calculated using the formula dY = r * y * (1 - y / K), where r is the rate of growth, y is the current population size, and K is the carrying capacity. When retrieving items from the parms list, make sure to use the double bracket syntax [[]].
Solution
pop_growth <- function(t, y, parms) {
  r <- parms[['r']]
  K <- parms[['K']]
  dY <- r * y * (1 - y / K)
  list(c(dY))
}
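Before handing the function to a solver, it can help to call it by hand. The sketch below repeats the definition so it runs on its own, then evaluates the growth rate at a few population sizes; the parameter values r = 0.1 and K = 1000 and the helper name rate_at are chosen here purely for illustration.

```r
# Same model function as in the solution, repeated so this block is self-contained
pop_growth <- function(t, y, parms) {
  r <- parms[['r']]
  K <- parms[['K']]
  dY <- r * y * (1 - y / K)
  list(c(dY))
}

# Illustrative parameter values
parms <- list(r = 0.1, K = 1000)

# Hypothetical helper: unwrap the list and return the plain growth rate
rate_at <- function(y) pop_growth(0, y, parms)[[1]]

rate_at(100)    # small population: growth is close to r * y
rate_at(500)    # half of K: growth rate is at its maximum
rate_at(1000)   # at carrying capacity: growth stops
```

The rate rises until the population reaches half of the carrying capacity and then falls to zero at K, which is the "decreasing growth" behaviour the model is named for.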
Task 2.3: Setting Initial Conditions and Parameters
Create variables representing the initial conditions and parameters for the population growth model. Start with an initial population size of 100, a growth rate of 0.1, and a carrying capacity of 1000. Define a sequence of times from 1 to 100.
Hint
Assign the specified numbers to variables with appropriate names corresponding to the formula. Use the seq() function to create a sequence of numbers.
Solution
# Initial population size
y0 <- 100
# Rate of growth
r <- 0.1
# Carrying capacity
K <- 1000
# Sequence of times
times <- seq(1, 100, by = 1)
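This particular model also has a well-known closed-form solution, which gives you something to compare a numerical result against. The sketch below is an optional cross-check, not part of the lab tasks; the function name logistic_exact and its t0 argument are assumptions introduced here.

```r
# Closed-form solution of dY/dt = r * y * (1 - y / K) with y(t0) = y0
# (hypothetical helper, not part of deSolve)
logistic_exact <- function(t, y0, r, K, t0 = 0) {
  K / (1 + ((K - y0) / y0) * exp(-r * (t - t0)))
}

# Same values as in the task; time starts at 1, so t0 = 1
times <- seq(1, 100, by = 1)
exact <- logistic_exact(times, y0 = 100, r = 0.1, K = 1000, t0 = 1)

head(exact, 3)   # starts at the initial population size
tail(exact, 1)   # ends just below the carrying capacity
```

If a numerical solution on the same time grid differs noticeably from these values, that usually points to a mistake in the model function or the parameters.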
Task 2.4: Solving the ODE
Use the ode function from the deSolve package to solve the ODE for the population growth model. Specify the initial state, time sequence, and parameters. Store the result in a variable named solution.
Hint
The ode function requires the initial state (y), the time sequence (times), the model function (func), and the parameters (parms), which should include the r and K parameters.
Solution
solution <- ode(y = y0, times = times, func = pop_growth, parms = list(r = r, K = K))
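To build intuition for what a numerical solver does with these four inputs, here is a deliberately simple forward Euler loop in base R. It is a teaching sketch only: it is not how deSolve solves the problem, it is less accurate than a real solver, and the euler_solve name is an assumption made for this example.

```r
# Model function, repeated so this block runs on its own
pop_growth <- function(t, y, parms) {
  r <- parms[['r']]
  K <- parms[['K']]
  list(c(r * y * (1 - y / K)))
}

# Forward Euler: step along the time grid using the current slope
euler_solve <- function(y0, times, func, parms) {
  y <- numeric(length(times))
  y[1] <- y0
  for (i in seq_along(times)[-1]) {
    h  <- times[i] - times[i - 1]                   # step size
    dy <- func(times[i - 1], y[i - 1], parms)[[1]]  # slope at the previous point
    y[i] <- y[i - 1] + h * dy
  }
  cbind(time = times, y = y)
}

approx <- euler_solve(100, seq(1, 100, by = 1), pop_growth,
                      list(r = 0.1, K = 1000))
head(approx)
```

The result has the same two-column layout, time and state, that the plotting step relies on, so you could plot approx[, 1] against approx[, 2] in the same way.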
Task 2.5: Visualizing the Population Growth
Visualize the solution of the population growth model using R's plotting functions. Plot the population size over time to understand how the population evolves.
Hint
Use the plot function with type = 'l' to create a line plot. The x-axis should represent time, and the y-axis should represent the population size.
Solution
plot(solution[,1], solution[,2], type = 'l', xlab='Time', ylab='Population')
Solving the 8 Queens Problem with Local Search Optimization
To review the concepts covered in this step, please refer to the Applying Mathematical Models in R module of the Understanding Statistical Models and Mathematical Models course.
The 8 Queens problem is a classic example of a combinatorial optimization problem that can be solved using local search techniques. The idea is to position 8 queens on an 8x8 chessboard in such a way that they cannot attack one another. This step will give you hands-on experience with optimization techniques, enhancing your problem-solving skills in mathematical modeling.
Tackle the 8 Queens problem using local search optimization techniques in R. Start by setting up an initial state of the chessboard with all 8 queens placed in the first column. Implement and apply simulated annealing, stochastic local search, and threshold accepting algorithms to find a solution where no two queens attack each other. Use helper functions to generate candidate solutions and visualize the board's state during the optimization process. This practical exercise will help you understand the application of local search techniques in solving complex optimization problems.
Task 3.1: Load Packages and Convenience Functions
Load the NMOF R package, which comes pre-installed in this environment. Then, load the provided convenience functions into the environment.
Hint
Use the library() function to load a package. Then, simply run the provided code to store the convenience functions in the environment.
Solution
# Load the package
library(NMOF)

# Load the provided convenience functions
print_board <- function(position, q.char = "1", sep = " ") {
  n <- length(position)
  row <- rep("*", n)
  for (i in seq_len(n)) {
    row_i <- row
    row_i[position[i]] <- q.char
    cat(paste(row_i, collapse = sep))
    cat("\n")
  }
}

neighbor <- function(position, board_size = 8) {
  step <- 2
  i <- sample.int(board_size, 1)
  position[i] <- position[i] + sample(c(1:step, -(1:step)), 1)
  if (position[i] > board_size) {
    position[i] <- 1
  } else if (position[i] < 1) {
    position[i] <- board_size
  }
  return(position)
}

n_attacks <- function(position) {
  sum(duplicated(position)) +
    sum(duplicated(position - seq_along(position))) +
    sum(duplicated(position + seq_along(position)))
}
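It is worth checking how the objective function scores a board before optimizing it. In the provided helpers, position[i] is the column of the queen in row i; n_attacks counts repeated columns and repeated values on each of the two diagonal directions. The sketch below repeats n_attacks so it runs on its own and scores the starting board alongside one known valid arrangement (the start and solved variable names are introduced here).

```r
# Objective function from the provided helpers, repeated for a self-contained example
n_attacks <- function(position) {
  sum(duplicated(position)) +
    sum(duplicated(position - seq_along(position))) +
    sum(duplicated(position + seq_along(position)))
}

# Starting board: every queen in column 1
start <- rep(1, 8)
n_attacks(start)    # 7: seven queens repeat a column that is already occupied

# One known solution to the 8 Queens problem
solved <- c(1, 5, 8, 6, 3, 7, 2, 4)
n_attacks(solved)   # 0: no shared columns or diagonals
```

Note that the score counts duplicated values rather than attacking pairs, so it is a penalty to drive toward zero, not an exact count of attacks.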
Task 3.2: Solve the N-Queens Problem using Simulated Annealing
The provided code initializes the position of the queens along the first column. Use simulated annealing to solve the N-Queens problem on an 8x8 board.
Hint
Use the SAopt() function from the NMOF package for simulated annealing. Seek to minimize n_attacks and use the neighbor function as the neighbourhood function to return a changed solution at each step.
Solution
# Provided code to initialize a board
pos0 <- rep(1, 8)
# Solve the N-Queens problem
solution1 <- SAopt(n_attacks, list(x0 = pos0, neighbour = neighbor, printBar = TRUE, nS = 1000))
print_board(solution1$xbest)
Task 3.3: Solve the N-Queens Problem using Stochastic Local Search
Use Stochastic Local Search to solve the N-Queens problem on an 8x8 board.
Hint
Use the LSopt() function from the NMOF package for stochastic local search. Seek to minimize n_attacks and use the neighbor function as the neighbourhood function to return a changed solution at each step.
Solution
solution2 <- LSopt(n_attacks, list(x0 = pos0, neighbour = neighbor, printBar = TRUE, nS = 1000))
print_board(solution2$xbest)
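To make the idea behind a stochastic local search concrete, here is a minimal base-R sketch: propose a random neighbour, keep it if it is no worse, and repeat. It is an illustration of the general technique under simple assumptions, not NMOF's implementation; the local_search function and the cost field are names invented for this example.

```r
set.seed(42)  # fixed seed so the run is repeatable

# Helpers from the lab, repeated so this block runs on its own
neighbor <- function(position, board_size = 8) {
  step <- 2
  i <- sample.int(board_size, 1)
  position[i] <- position[i] + sample(c(1:step, -(1:step)), 1)
  if (position[i] > board_size) {
    position[i] <- 1
  } else if (position[i] < 1) {
    position[i] <- board_size
  }
  position
}

n_attacks <- function(position) {
  sum(duplicated(position)) +
    sum(duplicated(position - seq_along(position))) +
    sum(duplicated(position + seq_along(position)))
}

# Hypothetical minimal local search: accept a candidate only if it is not worse
local_search <- function(x0, n_steps = 1000) {
  current <- x0
  current_cost <- n_attacks(current)
  for (step in seq_len(n_steps)) {
    candidate <- neighbor(current)
    candidate_cost <- n_attacks(candidate)
    if (candidate_cost <= current_cost) {
      current <- candidate
      current_cost <- candidate_cost
    }
  }
  list(xbest = current, cost = current_cost)
}

result <- local_search(rep(1, 8))
result$cost    # remaining penalty; 0 would mean a valid board
result$xbest   # column of the queen in each row
```

Because this search never accepts a worse board, it can get stuck in a local minimum. Simulated annealing and threshold accepting differ mainly in that they sometimes accept a worse candidate, which gives them a way out.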
Task 3.4: Solve the N-Queens Problem using the Threshold Accepting Method
Use Threshold Accepting to solve the N-Queens problem on an 8x8 board.
Hint
Use the TAopt() function from the NMOF package for threshold accepting. Seek to minimize n_attacks and use the neighbor function as the neighbourhood function to return a changed solution at each step.
Solution
solution3 <- TAopt(n_attacks, list(x0 = pos0, neighbour = neighbor, printBar = TRUE, nS = 1000))
print_board(solution3$xbest)
Performing Hypothesis Testing in R
To review the concepts covered in this step, please refer to the Applying Statistical Models in R module of the Understanding Statistical Models and Mathematical Models course.
Hypothesis testing is a fundamental concept in statistical modeling, enabling you to make inferences about populations based on sample data. This step will provide practice in setting up and executing hypothesis tests, a critical skill in statistical analysis.
Practice hypothesis testing using the HRIS.csv dataset. Use the t.test() function in R to perform a two-sample t-test to compare the average salaries of male and female employees. Interpret the results, focusing on the p-value and the test statistic, to determine if there is a significant difference in salaries. This exercise will reinforce your understanding of hypothesis testing and its application in analyzing real-world data.
Task 4.1: Load the HRIS Dataset
Begin by loading the HRIS.csv dataset into R. Read the file into R as a data frame named hris_data. Display the first few rows of the dataset.
Hint
Use the read.csv() function with the file path as its argument to load the dataset. Then, use the head() function on your dataset variable to display its contents.
Solution
# Load the HRIS dataset
hris_data <- read.csv('HRIS.csv')
# Display the first few rows of the dataset
head(hris_data)
Task 4.2: Perform a Two-Sample t-Test
Perform a two-sample t-test comparing the average salaries of male and female employees. Store the result in a variable named salary_test. Finally, print the result to interpret the p-value and the test statistic.
Hint
Use the t.test() function. Subset by gender and select the salary column to create an x and y variable to pass to t.test().
Solution
# Perform a two-sample t-test comparing the average salaries of male and female employees
male_salary <- hris_data[hris_data$gender == 'Male', "salary"]
female_salary <- hris_data[hris_data$gender == 'Female', "salary"]
salary_test <- t.test(male_salary, female_salary)
# Print the result
print(salary_test)
# ___ have a higher average salary, and the difference is ___
# Answer: Males, not statistically significant
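Rather than reading the p-value and test statistic off the printed output, you can pull them from the result object. The sketch below uses simulated data, not HRIS.csv, so the group names, sample sizes, and the 0.05 threshold are all assumptions for illustration.

```r
set.seed(1)  # fixed seed so the simulated samples are repeatable

# Two simulated salary samples (hypothetical values, not from the lab dataset)
group_a <- rnorm(40, mean = 60000, sd = 8000)
group_b <- rnorm(40, mean = 58000, sd = 8000)

# With two vectors and default arguments, t.test() runs Welch's two-sample test
res <- t.test(group_a, group_b)

res$method      # which test was performed
res$statistic   # the t statistic
res$p.value     # the p-value
res$estimate    # the two sample means
res$conf.int    # confidence interval for the difference in means

# A common decision rule, assuming a 5% significance level
significant <- res$p.value < 0.05
significant
```

The same fields are available on salary_test, so res$estimate tells you which group has the higher mean and res$p.value tells you whether the difference is statistically significant at your chosen level.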