Introduction


The Monte Carlo method is a class of algorithms that relies on repeated random sampling from probability distributions to estimate the probability or distribution of a specific outcome. It is suitable when other approaches are difficult or impossible to use, and it is common in sensitivity analysis, option pricing, financial risk measurement, and risk management applications.

In this guide, you will learn how to use the built-in R functions to run Monte Carlo simulations. You will set up a simulation and plot the results of the simulation runs. Then you will generate summary statistics to make it easier to understand the distribution of outcomes. This guide assumes an intermediate understanding of the R programming language.

A Monte Carlo simulation repeatedly runs a model on inputs that are uncertain and variable, producing one simulated outcome per run. A common but powerful strategy for modelling this uncertainty is to randomly sample input values from a probability distribution. This allows you to create thousands of input sets for your model and run the model once for each set, which has several benefits:

- Your output is a large set of results. This means that you have a probability distribution of outcomes rather than a single point estimate.
- Monte Carlo generates a distribution of simulated outcomes. This makes it easy to graph and communicate findings.
- It is easy to change the assumptions of the models by varying the distribution type or properties of the inputs.
- You can easily model correlation between input variables.
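To make the first two benefits concrete, here is a minimal sketch in base R. It assumes a hypothetical toy model in which profit is units sold multiplied by the margin per unit. The model, the variable names, and the distribution parameters are illustrative assumptions and are not part of the example developed later in this guide.

```r
set.seed(123)  # fix the random seed so the run is reproducible

n_sims <- 10000

# Hypothetical uncertain inputs, each sampled from an assumed distribution
units_sold  <- rnorm(n_sims, mean = 1000, sd = 100)
unit_margin <- runif(n_sims, min = 1.5, max = 2.5)

# Run the toy model once per input set (vectorised over all draws)
profit <- units_sold * unit_margin

# Summarise the distribution of outcomes instead of reporting one number
mean_profit   <- mean(profit)
profit_bands  <- quantile(profit, probs = c(0.05, 0.5, 0.95))
prob_below    <- mean(profit < 1800)  # estimated P(profit < 1800)

mean_profit
profit_bands
prob_below
```

The result is a vector of 10,000 simulated profits, so questions such as "how likely is a profit below 1,800?" can be answered directly from the sample.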

Monte Carlo simulations are made easy in the R programming language since there are built-in functions to randomly sample from various probability distributions. The stats package prefixes these functions with `r` to represent random sampling. Some examples of sampling from these distributions are demonstrated in the code snippet below:

```r
# sample from a uniform distribution
stats::runif(1, min = 0, max = 1)

# sample from an exponential distribution
stats::rexp(1)

# sample from a normal distribution
stats::rnorm(1)

# sample from a log-normal distribution
stats::rlnorm(1)
```

Note that the `stats::` qualified namespace is used to clarify the source of these functions. However, this is not strictly necessary, because the stats package is attached by default in an R session.
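The `r`-prefixed functions take the number of draws as their first argument, so a simulation rarely needs to sample one value at a time. The sketch below, which uses an arbitrary seed and sample size chosen only for illustration, draws many values at once and shows that `set.seed()` makes the draws reproducible:

```r
# Draw 10,000 standard normal values in a single call
set.seed(42)
draws <- stats::rnorm(10000, mean = 0, sd = 1)

sample_mean <- mean(draws)  # should be close to 0
sample_sd   <- sd(draws)    # should be close to 1

# Resetting the seed reproduces exactly the same sequence of draws
set.seed(42)
same_draws <- identical(draws, stats::rnorm(10000, mean = 0, sd = 1))

sample_mean
sample_sd
same_draws
```

Fixing the seed is useful when you want to compare two versions of a model on identical random inputs.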

One situation where Monte Carlo is appropriate is when you need to represent a sequence of decisions that are influenced by outside stochastic risk factors. In the following concrete example, you will model an asset allocation problem where you decide what portion of wealth should be allocated to risk-free investment or high-risk investment at multiple discrete time periods. In this simulation, the returns from the previous period contribute to the returns of the next period. This means that a single-point model is inappropriate.

In this example, there are two sources of uncertainty:

- The uncertain return of the risky asset
- How much to allocate to each type of investment

The code snippet below shows a simple function that calculates returns based on different asset allocations.

```r
calculate_return <- function(alpha) {
  risk_free_rate <- 1.03
  risky_rate <- rnorm(1) * 0.05 + 1
  (1 - alpha) * risk_free_rate + alpha * risky_rate
}
```

`alpha` is an allocation variable between 0 and 1 that determines the share of wealth allocated to the risky asset in each discrete time period; the remaining `1 - alpha` goes to the risk-free asset. `risk_free_rate` is a fixed yield that doesn't change between periods. `risky_rate` is a continuous random variable, normally distributed with a mean of `1` and a standard deviation of `0.05`, to represent uncertainty.
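A quick way to check how the function behaves is to evaluate it at the two extreme allocations. The sketch below repeats the definition of `calculate_return` so that it runs on its own; the seed and the number of draws are arbitrary choices for illustration.

```r
calculate_return <- function(alpha) {
  risk_free_rate <- 1.03
  risky_rate <- rnorm(1) * 0.05 + 1
  (1 - alpha) * risk_free_rate + alpha * risky_rate
}

set.seed(1)

# alpha = 0: all wealth in the risk-free asset, so the result is not random
risk_free_only <- calculate_return(0)

# alpha = 1: all wealth in the risky asset, so each call is a fresh random draw
risky_draws <- replicate(10000, calculate_return(1))

risk_free_only
mean(risky_draws)  # empirical centre of the risky return
sd(risky_draws)    # empirical spread of the risky return
```

As written, the expression `rnorm(1) * 0.05 + 1` has a mean of 1 and a standard deviation of 0.05, so the empirical values printed above should land close to those figures.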

Now that we have a model set up, we can begin running it.

The code below executes 1,000 runs of the model over twelve discrete time periods.

```r
# install.packages("tidyverse")  # run once if the tidyverse is not installed
library(tidyverse)

RUNS <- 1000
DECISION.STEPS <- 12

# map() over seq_len(RUNS) replaces rerun(), which is deprecated as of purrr 1.0.0
simulations <- map(seq_len(RUNS),
                   ~ replicate(DECISION.STEPS, runif(1) %>% calculate_return())) %>%
  set_names(paste0("sim", 1:RUNS)) %>%
  # compound the per-period returns within each simulation
  map(~ accumulate(., ~ .x * .y)) %>%
  map_dfr(~ tibble(value = .x, step = 1:DECISION.STEPS), .id = "simulation")

simulations %>%
  ggplot(aes(x = step, y = value)) +
  geom_line(aes(color = simulation)) +
  theme(legend.position = "none") +
  ggtitle("Simulations of returns from asset allocation")
```

Note that this code uses functional programming features offered by the purrr package, which you can read more about in Explore R Libraries: Purrr.

When you plot the simulation outputs using `ggplot2`, you will see a distribution of outcomes. In this case, each line represents the cumulative return on investment from one simulation run, based on a different series of inputs.
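The compounding step is what links one period to the next: the value at each step is the product of all per-period returns up to that step. As a small base R sketch with made-up period returns (the three values below are illustrative, not simulation output), `cumprod()` performs the same running multiplication:

```r
# Hypothetical per-period gross returns for a single simulation run
period_returns <- c(1.03, 0.98, 1.05)

# Running product: value of 1 unit of wealth after each period
cumulative <- cumprod(period_returns)

# The same calculation written as an explicit running reduction
manual <- Reduce(`*`, period_returns, accumulate = TRUE)

cumulative
all.equal(cumulative, manual)
```

Here the wealth after three periods is 1.03 × 0.98 × 1.05, which is why a weak return in an early period carries through to every later step.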

To make the output data easier to understand, you can summarize the data. For example, you can compute the `min`, `max`, and `mean` of your simulation runs across the time steps. To do this, run the following code:

```r
summary_values <- simulations %>%
  group_by(step) %>%
  summarise(mean_return = mean(value),
            max_return = max(value),
            min_return = min(value)) %>%
  gather("series", "value", -step)

summary_values %>%
  ggplot(aes(x = step, y = value)) +
  geom_line(aes(color = series)) +
  ggtitle("Mean values from simulations")
```
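The minimum and maximum depend heavily on how many runs you execute, so percentiles are a common complement to them. The following is a sketch in base R only, under the assumption that you are interested in the value at the final step. It rebuilds the simulated paths as a matrix rather than reusing the `simulations` tibble, and the seed is an arbitrary choice.

```r
calculate_return <- function(alpha) {
  risk_free_rate <- 1.03
  risky_rate <- rnorm(1) * 0.05 + 1
  (1 - alpha) * risk_free_rate + alpha * risky_rate
}

set.seed(2024)
RUNS <- 1000
DECISION.STEPS <- 12

# One column per simulation run, one row per time step
paths <- replicate(
  RUNS,
  cumprod(replicate(DECISION.STEPS, calculate_return(runif(1))))
)

# Distribution of cumulative returns at the last step
final_values <- paths[DECISION.STEPS, ]
final_bands <- quantile(final_values, probs = c(0.05, 0.5, 0.95))

dim(paths)
final_bands
```

The 5th and 95th percentiles give a range that contains roughly 90% of the simulated outcomes.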

In the model above, you used `rnorm` to assume a normal distribution of returns on risky investments. You can change this assumption by changing this function to a different type of distribution as discussed earlier in this guide. You also coded the risk-free rate to be `1.03` and the risky rate of return to be centered around `1` with a standard deviation of `0.05`.
Using the Monte Carlo method, you can easily change these variables and see what impact this will have on the distribution of returns.
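One way to make such changes without editing the function body each time is to pass the assumptions in as arguments. The sketch below is only an illustration: `calculate_return_with` and its `sample_risky` argument are hypothetical names introduced here, and the log-normal parameters are assumed values chosen to be comparable to the normal case.

```r
# Hypothetical variant of calculate_return with adjustable assumptions
calculate_return_with <- function(alpha,
                                  risk_free_rate = 1.03,
                                  sample_risky = function() rnorm(1) * 0.05 + 1) {
  (1 - alpha) * risk_free_rate + alpha * sample_risky()
}

set.seed(7)

# Default assumption: normally distributed risky return
normal_draws <- replicate(5000, calculate_return_with(1))

# Alternative assumption: log-normal risky return, which is never negative
lognormal_draws <- replicate(
  5000,
  calculate_return_with(1, sample_risky = function() rlnorm(1, meanlog = 0, sdlog = 0.05))
)

# A different risk-free rate, fully allocated to the risk-free asset
lower_rate <- calculate_return_with(0, risk_free_rate = 1.02)

summary(normal_draws)
summary(lognormal_draws)
lower_rate
```

Comparing the two summaries shows how the choice of distribution alone shifts the simulated outcomes, while everything else in the model stays the same.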

This guide has demonstrated a simple use of Monte Carlo simulation in R. However, remember that this kind of simulation will not provide useful results if the underlying model is flawed. For this reason, analysts require a combination of mathematical, financial, and programming knowledge. You can learn more by reading Handbook in Monte Carlo Simulation: Applications in Financial Engineering, Risk Management, and Economics.
