Dániel Szabó

Scoping and Closures in R

  • Apr 9, 2020
  • 5 Min read
  • 308 Views

Introduction

In this guide, we'll look at two foundational concepts of the R programming language: scoping and closures. First we'll clarify what scope is and how it works. Programming languages use two main scoping strategies, lexical and dynamic, and we'll see which one R uses. After that, we'll introduce closures and work through examples of both concepts.

Scopes

The scope of a variable is the region of the code in which that variable is visible and can be referenced. There are two basic approaches to scoping: lexical scoping and dynamic scoping. R also has the notion of free variables, which are symbols used in a function body that are neither formal arguments nor local variables. The values of such variables are looked up in the environment in which the function was defined.

Let's look at an example of free variables.

f <- function(a, b) {
  (a * b) / z
}

In this function, you have two formal arguments, a and b. The body also contains another symbol, z, which is a free variable. The scoping rules of the language determine how a value is bound to a free variable. R uses lexical scoping, which means that the value of z is searched for in the environment where the function was defined.

Note: Lexical scoping is also referred to as static scoping.
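As a minimal sketch of how the free variable gets its value, assume z is defined in the same environment as f (the numbers are illustrative and not from the original example):

```r
# f has one free variable, z, which is neither an argument nor a local.
f <- function(a, b) {
  (a * b) / z
}

# Define z in the same environment where f was defined.
z <- 2
first <- f(3, 4)   # (3 * 4) / 2

# The lookup happens each time f runs, so a new value of z is picked up.
z <- 4
second <- f(3, 4)  # (3 * 4) / 4

print(first)   # 6
print(second)  # 3
```

Note that lexical scoping fixes where z is looked up, not which value it had when f was created.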

With dynamic scoping, by contrast, a free variable is resolved in the environment from which the function was called, so it takes the most recent binding on the call stack. Scoping is related to another concept called extent: the interval of time during execution in which a binding exists and may be referenced. A historical note: lexical scoping goes back to ALGOL 60, whereas the original LISP, which John McCarthy described in his 1960 paper, was dynamically scoped.
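The difference between the two strategies can be sketched with a small experiment. The names show_z and caller are hypothetical and chosen only for this illustration:

```r
z <- 10

# show_z is defined next to z, so its free variable z resolves there.
show_z <- function() {
  z
}

# caller has its own local z and then calls show_z.
caller <- function() {
  z <- 99
  show_z()
}

result <- caller()
print(result)  # 10 under R's lexical scoping; dynamic scoping would give 99
```

The local z inside caller is invisible to show_z, because show_z was not defined inside caller.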

R provides some escape routes to work around the limits of lexical scoping, but ordinary assignment is not one of them. The <- operator is the variable assignment operator. Given the expression a <- 3.14, the value is bound to a in the current environment. If a was already bound in that environment, the new value overwrites the old one. Assignment with <- only ever updates the current environment, and it never creates a new scope. When R looks up the value of a variable, it searches from the innermost environment outward: the current environment is inspected first, then its enclosing environment, and so on. The search continues until either the value is found or the empty environment is reached.
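To illustrate that <- only touches the current environment, here is a small sketch. The function name shadow is an assumption made for this example:

```r
a <- 3.14

# Assigning to a inside the function creates a new binding in the
# function's own environment; the outer a is left alone.
shadow <- function() {
  a <- 1
  a
}

inner_value <- shadow()
print(inner_value)  # 1
print(a)            # 3.14
```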

Let's demonstrate lookup.

a <- 3.14
# a is a free variable inside b; it is looked up when b is called
b <- function(x, y) { x * y / a }
b(10, 11)

The output is the following:

[1] 35.03185

When the function is called, only the two arguments are passed. R tries to look up the value of a and first searches the function's own environment. Since a is not found there, R looks in the enclosing environment, where it finally finds it. If you had not defined a, the call would fail with Error in b(10, 11) : object 'a' not found, which states that the lookup has failed.
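A failed lookup can be observed without stopping a script by catching the error. In this sketch, scale_by and missing_rate are hypothetical names, and missing_rate is assumed not to be defined anywhere:

```r
# missing_rate is a free variable with no binding in any enclosing environment.
scale_by <- function(x) {
  x / missing_rate
}

# tryCatch turns the lookup error into a value so that the script keeps running.
message_text <- tryCatch(
  scale_by(10),
  error = function(e) conditionMessage(e)
)
print(message_text)
```

The printed message names the symbol that could not be found, which is usually the quickest hint about which binding is missing.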

This brings us to the concept of environments. Environments in R are essentially mappings from variable names to values. Every function call gets a local environment, and every function holds a reference to its enclosing environment. This is what makes scoping and lookup work. You can add, remove, or modify variable mappings, and you can even change a function's reference to its enclosing environment.
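These operations can be sketched with base R functions such as new.env, assign, get, rm, and environment. The names env, env2, g, and rate are assumptions made for this illustration:

```r
# Create an environment whose enclosing environment is the global one.
env <- new.env(parent = globalenv())

# Add, modify, and remove a mapping.
assign("rate", 1.05, envir = env)
assign("rate", 1.07, envir = env)   # overwrite the existing mapping
rate_now <- get("rate", envir = env)
rm("rate", envir = env)
still_there <- exists("rate", envir = env, inherits = FALSE)

# A function's enclosing environment can be read and replaced.
g <- function() rate
env2 <- new.env()
assign("rate", 2, envir = env2)
environment(g) <- env2
g_value <- g()

print(rate_now)     # 1.07
print(still_there)  # FALSE
print(g_value)      # 2
```

Replacing a function's environment is rarely needed in everyday code, but it shows that the enclosing environment is just a reference that R follows during lookup.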

Closures

In R, functions are first-class values, and evaluating a function definition produces a closure. A closure consists of the function's formal arguments, its body, and the environment in which the function was created. This makes it possible to create functions whose behavior depends on the environment they were created in. After you have programmed in R for a while, you get used to passing functions as arguments, and these usually return a plain result. But functions can also return other functions. This lets you build abstractions that reduce both the complexity of a problem and the time spent solving it.

Let's say your HR department wants a function that increases people's salaries by 5%.

hrfunction_5 <- function(base) { base * 1.05 }

Later, the department makes another request, this time to create a function that increases salaries by 7%.

hrfunction_7 <- function(base) { base * 1.07 }

The requests keep coming for different salary modifications. You decide to call on the aid of abstraction and do the following:

hrfunction <- function(incr) { function(base) { base * incr } }

From now on you can refer to each percentage the following way:

hr_5 <- hrfunction(1.05)
hr_7 <- hrfunction(1.07)

hr_5(100)
hr_7(100)

The output should be as follows:

[1] 105
[1] 107

You might be wondering how the hr_5 and hr_7 functions know where to look for the value of incr. Because of lexical scoping, each function carries a reference to the environment in which it was defined. Every call to hrfunction creates a new environment in which incr is bound to the argument value, and the returned function keeps that environment as its enclosing environment.
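One way to check this is to inspect the enclosing environment of each returned function. The sketch below repeats the hrfunction definition so that it runs on its own; the variable names incr_5, incr_7, and same_env are assumptions for illustration:

```r
hrfunction <- function(incr) {
  function(base) {
    base * incr
  }
}

hr_5 <- hrfunction(1.05)
hr_7 <- hrfunction(1.07)

# Each returned function keeps its own enclosing environment with its own incr.
incr_5 <- get("incr", envir = environment(hr_5))
incr_7 <- get("incr", envir = environment(hr_7))
same_env <- identical(environment(hr_5), environment(hr_7))

print(incr_5)    # 1.05
print(incr_7)    # 1.07
print(same_env)  # FALSE
```

Because the two environments are distinct, creating hr_7 has no effect on the increment that hr_5 uses.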

Conclusion

In this guide, you have learned about two core concepts of R, gaining a deeper understanding of how scoping works and how to turn it to your advantage. I hope this guide has been informative to you and I would like to thank you for reading it.
