Assignment in R

As in other programming languages, data is associated with a name to make it reusable within a program. This operation is typically known as an assignment.

R offers several ways to carry out this operation, and they are the focus of this post. If you have additional ideas about this topic after reading, kindly share them in the comments so that we can learn together.

1. Assignment operators

Again, just like in most programming languages, R provides a set of operators that are used for assignments as follows:

  • The equals sign operator (=)
  • The ‘arrow’ operator (<-)
  • The super-assignment operator (<<-)

The = and <- do essentially the same thing and, to a large extent, can be used interchangeably in R code. So, the following statements give the same result:

x = "A string"
x
## [1] "A string"

y <- "A string"
y
## [1] "A string"

There are, however, some subtle differences between the two operators. Under R's operator precedence rules (see ?Syntax), <- binds more tightly than =, so when the two appear in the same statement, the arrow assignment is grouped first. Thus, although the following lines of code are valid

x <- y <- 3
x = y = 3
x = y <- 3

…this line of code fails with an error

x <- y = 3

This is because <- has higher precedence, so R reads the statement as (x <- y) = 3. The left-hand side of = is then an expression rather than a name, and R cannot assign to it; the statement fails even when y already has a value. To make this work, we need to introduce parentheses, which force R to evaluate the parenthesized expression first:

x <- (y = 3)
x
## [1] 3
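How R groups the failing statement can be checked without running it. The following is a minimal sketch using base R's parse(); the names expr, top_operator, left_side, scratch and outcome are made up for illustration.

```r
# Parse the statement without evaluating it
expr <- parse(text = "x <- y = 3")[[1]]

top_operator <- as.character(expr[[1]])  # the operator applied last
left_side <- deparse(expr[[2]])          # what R tries to assign to

top_operator
left_side

# Evaluate in a scratch environment where y already has a value,
# catching the error instead of stopping the script
scratch <- new.env()
assign("y", 10, envir = scratch)
outcome <- tryCatch(
  eval(expr, envir = scratch),
  error = function(e) "error"
)
outcome
```

The outermost call is =, and its target is the whole expression x <- y, which is why the statement fails whether or not y exists.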

For a better understanding of how this works, John Chambers provided the following explanation way back in 2001:

The development version of R now allows some assignments to be written C- or Java-style, using the = operator. This increases compatibility with S-Plus (as well as with C, Java, and many other languages)

One of the cardinal principles that underpin the development of R is the primacy of extensibility, which is what informed the adoption of = as an assignment operator. The <- operator originally came from APL, and in the early days of S, R’s predecessor language, input devices had a key that entered the <- operator. (Side note: on modern keyboards, two keystrokes are required to produce the operator, though some tools like ESS and RStudio provide shortcuts.) Chambers had this to say about how the operators are different:

The new assignments are allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.

This explains how we arrived at the results in the code examples above.
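One practical consequence of this grammar rule is that, inside a function call, = names an argument instead of creating a variable, while <- still assigns. Here is a small sketch of that difference; compare_operators() is a hypothetical helper written only for this demonstration.

```r
compare_operators <- function() {
  here <- environment()

  # `=` inside a call only names the argument; no variable x is created
  m1 <- mean(x = 1:10)
  after_equals <- exists("x", envir = here, inherits = FALSE)

  # `<-` inside a call creates x in this environment, then passes its value
  m2 <- mean(x <- 1:10)
  after_arrow <- exists("x", envir = here, inherits = FALSE)

  c(after_equals = after_equals, after_arrow = after_arrow)
}

result <- compare_operators()
result
```

Both calls compute the same mean; only the second one leaves a variable named x behind.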

Prefer the ‘<-’ operator

If you’re new to R, particularly if you’re coming from another language, it’s important to know that although = works for creating variables, the <- operator is preferred by the R community and in fact predates =. Some style guides (like Google’s) discourage the use of = as an assignment operator. Even so, it is worth knowing how = behaves because you will find it in a lot of code in the wild.

Another cool thing about the <- operator is that it can face both directions and give the same results. Thus,

3 -> y -> x

…is valid R code. Whether it is that useful is debatable, but personally I use it a lot when I have written a long chain of pipes and want to store the final result in a variable. Here’s what I mean:

# demo using pseudocode
data |>
do one thing |>
then do something else |>
and then this |>
followed by that |>
and yet another thing |>
...
...
...
and finally do this -> x

Sometimes, when I’m writing code like this, it helps me to have the newly created variable at the bottom so that it’s close to where I’ll likely use it next. This is only made possible by the bidirectional nature of the <- operator.
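As a runnable counterpart to the pseudocode, here is a minimal sketch with made-up numbers; the variable name total is just for illustration, and the native pipe |> assumes R 4.1 or later.

```r
# Requires R 4.1 or later for the native pipe |>
c(4, 9, 16) |>
  sqrt() |>
  sum() -> total

total
## [1] 9
```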

Super-assignment

R offers a version of the assignment operator that does super-assignment using the syntax <<-. It is usually used inside functions and carries out the assignment in an environment that encloses the function's own environment. It follows these rules:

  1. If the variable already exists in the function’s parent environment (or in any environment further up the chain of enclosing environments), it is reassigned there.
  2. If the variable is not found in any enclosing environment, it is created in the global environment.

Let’s demonstrate this with some code:

f <- function() {
  x <- 1  # regular assignment 
  
  g <- function() {
    x <<- 5  # super-assignment
  }

  g()  # carries out the assignment 
  x
}

f()
## [1] 5

In the above code, the function g() carries out a super-assignment; it is defined and called when f() is run. The internal variable x, which was initially bound to 1, has its value changed by the call to g(), and thus f() returns 5. However, as stated above, when the variable does not exist in any enclosing environment, the super-assignment falls back to the global environment (the workspace).

f <- function() {
  # x <- 1

  g <- function() {
    x <<- 5
  }

  g()
  # x
}

f()
x
## [1] 5

Here we have commented out all references to x in the function environment of f(). When it is called, we see that x has a value, even though we never defined it at the top level. This was made possible by the super-assignment done by g().
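A common use of the first rule is a function that remembers state between calls. The sketch below is one possible illustration, not part of the examples above; make_counter(), counter and count are hypothetical names.

```r
make_counter <- function() {
  count <- 0  # lives in the environment created by this call

  function() {
    # count is found in the enclosing environment, so it is updated there
    count <<- count + 1
    count
  }
}

counter <- make_counter()
counter()
counter()
third <- counter()
third
## [1] 3
```

Because count already exists in the enclosing environment, <<- updates it there rather than creating a variable in the global environment.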

2. The ‘assign()’ function

R gives us a facility for carrying out assignments programmatically via the assign() function. It takes a string argument, which will be the name of the variable, and the value to be bound to that name. The function also lets the user decide which environment to associate the name with. This is its signature:

function(x, value, pos = -1, envir = as.environment(pos), inherits = FALSE, immediate = TRUE) 

Modifying our earlier example, we have the following code:

assign("x", 3)
x
## [1] 3

As with our earlier code, we can use it inside a function:

f <- function() {
  assign("x", 3, envir = globalenv())
}

f()
x
## [1] 3

We can also reproduce the super-assignment we did earlier:

f <- function() {
  x <- 1  # regular assignment 
  
  g <- function() {
    assign("x", 5, envir = parent.frame())
  }

  g()  # does assignment in specified environment
  x
}

f()
## [1] 5
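Note that parent.frame() refers to the environment g() was called from, which here happens to be the environment of f(). A closer analogue of <<- is the inherits argument of assign(), which searches the enclosing environments for an existing variable. The following is a minimal sketch; f2() is a hypothetical variant of the function above.

```r
f2 <- function() {
  x <- 1

  g <- function() {
    # Start in g()'s own environment and search outwards for an existing x;
    # g() has no local x, so the x in f2() is the one that gets updated
    assign("x", 5, envir = environment(), inherits = TRUE)
  }

  g()
  x
}

result <- f2()
result
## [1] 5
```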

This function gives finer control over which environment the assignment is done in. And, as mentioned earlier, it makes it easy to create variables programmatically. For example:

for (i in 1:5)
  assign(paste0("x", i), i)

ls()
## [1] "i"  "x1" "x2" "x3" "x4" "x5"

x3
## [1] 3
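Variables created this way can be read back by name with get() and mget(), the counterparts of assign(). Here is a short sketch that keeps everything in a separate environment so the workspace stays clean; env, third and all_values are illustrative names.

```r
# Create the variables in their own environment instead of the workspace
env <- new.env()
for (i in 1:5)
  assign(paste0("x", i), i, envir = env)

# Look up a single variable by its name
third <- get("x3", envir = env)

# Look up several at once; mget() returns a named list
all_values <- unlist(mget(paste0("x", 1:5), envir = env))

third
all_values
```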

Conclusion

We have looked at some of the ways assignments can be done in R, the quirks and subtleties, as well as some useful applications. If you want to contribute your own ideas or experiences, leave a comment. Happy coding!

References

  1. Chambers, John (2001). Assignments with the = Operator.
  2. Smith, David (2008). Use = or <- for assignment?
  3. StackOverflow (2009). What are the differences between “=” and “<-” assignment operators?
  4. R help files for ?assignOps, ?Syntax, and ?assign.
