What are the differences between the assignment operators = and <- in R?
I know that the operators are slightly different, as this example shows:
x <- y <- 5
x = y = 5
x = y <- 5
x <- y = 5
# Error in (x <- y) = 5 : could not find function "<-<-"
But is this the only difference?
Answer
What are the differences between the assignment operators = and <- in R?
As your example shows, = and <- have slightly different operator precedence (which determines the order of evaluation when they are mixed in the same expression). In fact, ?Syntax in R gives the following operator precedence table, from highest to lowest:
…
‘-> ->>’ rightwards assignment
‘<- <<-’ assignment (right to left)
‘=’ assignment (right to left)
…
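As a minimal sketch of what this precedence difference means in practice, the two mixed expressions from the question can be parsed without being evaluated and then inspected. The variable names ok and bad are only illustrative.

```r
# Parse without evaluating, so we can inspect how each expression is grouped.
ok  <- parse(text = "x = y <- 5", keep.source = FALSE)[[1]]
bad <- parse(text = "x <- y = 5", keep.source = FALSE)[[1]]

# In both cases the outermost call is `=`, because it binds least tightly.
as.character(ok[[1]])   # "="
as.character(bad[[1]])  # "="

# `x = y <- 5` groups as x = (y <- 5): the right-hand side is itself an assignment.
deparse(ok[[3]])        # "y <- 5"

# `x <- y = 5` groups as (x <- y) = 5: the assignment target is the call `x <- y`,
# which R cannot assign into, hence the runtime error shown in the question.
deparse(bad[[2]])       # "x <- y"
```

Both expressions are syntactically valid; only the second fails, and only when it is evaluated.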
But is this the only difference?
Since you were asking about the assignment operators: yes, that is the only difference. However, you would be forgiven for believing otherwise. Even the R documentation of ?assignOps claims that there are more differences:
The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
Let’s not put too fine a point on it: the R documentation is (subtly) wrong [1]. This is easy to show: we just need to find a counter-example, a use of the = operator that is neither (a) at the top level nor (b) a subexpression in a braced list of expressions (i.e. {…; …}). Without further ado:
x
# Error: object 'x' not found
sum((x = 1), 2)
# [1] 3
x
# [1] 1
Clearly we’ve performed an assignment, using =, outside of contexts (a) and (b). So, why has the documentation of a core R language feature been wrong for decades?
It’s because in R’s syntax the symbol = has two distinct meanings that get routinely conflated:
- The first meaning is as an assignment operator. This is all we’ve talked about so far.
- The second meaning isn’t an operator but rather a syntax token that signals named argument passing in a function call. Unlike the = operator, it performs no action at runtime; it merely changes the way an expression is parsed.
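The two meanings can be told apart by looking at the parsed call rather than running it. The following is a small sketch; f is a hypothetical function name that never needs to exist, because the calls are only quoted, not evaluated.

```r
# Named argument passing: the `=` leaves no call behind, only an argument name.
named <- quote(f(x = 1))
names(named)     # "" "x"

# Extra parentheses: the argument is now unnamed, and it contains a call to `=`.
assigned <- quote(f((x = 1)))
names(assigned)  # NULL

# assigned[[2]] is the parenthesised argument `(x = 1)`;
# its second element is the call inside the parentheses.
inner <- assigned[[2]][[2]]
inner[[1]]       # the symbol `=`, i.e. the assignment operator
```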
Let’s see.
In any piece of code of the general form …
‹function_name›(‹argname› = ‹value›, …)
‹function_name›(‹args›, ‹argname› = ‹value›, …)
… the = is the token that defines named argument passing: it is not the assignment operator. Furthermore, = is entirely forbidden in some syntactic contexts:
if (‹var› = ‹value›) …
while (‹var› = ‹value›) …
for (‹var› = ‹value› in ‹value2›) …
for (‹var1› in ‹var2› = ‹value›) …
Any of these will raise an error “unexpected '=' in ‹bla›”.
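Because these are syntax errors, they can be checked with the parser alone. The sketch below uses a hypothetical helper, try_parse, that reports whether a string of code parses, and compares = with <- in the same position.

```r
# Hypothetical helper: returns "parses" or "syntax error" for a code string.
try_parse <- function(code) {
  tryCatch({
    parse(text = code, keep.source = FALSE)
    "parses"
  }, error = function(e) "syntax error")
}

# The four forbidden forms, filled in with concrete names and values.
forbidden <- c(
  "if (x = 1) TRUE",
  "while (x = 1) break",
  "for (i = 1 in 1:3) NULL",
  "for (i in x = 1:3) NULL"
)
results <- vapply(forbidden, try_parse, character(1), USE.NAMES = FALSE)
results  # every element is "syntax error"

# The same position accepts `<-` without complaint.
arrow <- try_parse("if (x <- 1) TRUE")
arrow    # "parses"
```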
In any other context, = refers to the assignment operator call. In particular, merely putting parentheses around the subexpression makes any of the above (a) valid, and (b) an assignment. For instance, the following performs assignment:
median((x = 1 : 10))
But also:
if (! (nf = length(from))) return()
Now you might object that such code is atrocious (and you may be right). But I took this code from the base::file.copy function (replacing <- with =); it’s a pervasive pattern in much of the core R codebase.
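To make the pattern concrete, here is a small self-contained sketch of the same idiom. The function describe_length is hypothetical and is not taken from base R; it only mirrors the shape of the condition above.

```r
# Hypothetical function using the assign-inside-a-condition idiom.
describe_length <- function(from) {
  # The inner parentheses turn `=` into an assignment call inside the condition,
  # so n is set and tested in one step.
  if (!(n = length(from))) return("empty")
  paste("length", n)
}

describe_length(character(0))  # "empty"
describe_length(c("a", "b"))   # "length 2"
```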
The original explanation by John Chambers, which the R documentation is probably based on, actually explains this correctly:
[= assignment is] allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.
A confession: I lied earlier. There is one additional difference between the = and <- operators: they call distinct functions. By default these functions do the same thing, but you can override either of them separately to change the behaviour. By contrast, <- and -> (left-to-right assignment), though syntactically distinct, always call the same function. Overriding one also overrides the other. Knowing this is rarely practical, but it can be used for some fun shenanigans.
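A minimal sketch of both claims follows. The function demo and its local replacement for = are hypothetical and exist only for illustration; the override is confined to the function body, so nothing outside it is affected.

```r
# The parser rewrites `->` into a call to `<-`, so the two calls are identical.
same_call <- identical(quote(1 -> x), quote(x <- 1))
same_call  # TRUE

demo <- function() {
  # Shadow `=` inside this function only (illustrative override, not a recommendation).
  `=` <- function(lhs, rhs) "intercepted"

  a <- (x = 1)  # dispatches to the local `=`; no variable x is created here
  y <- 2        # `<-` was not overridden, so this still assigns normally

  list(
    a = a,
    x_exists = exists("x", envir = environment(), inherits = FALSE),
    y = y
  )
}

res <- demo()
res$a         # "intercepted"
res$x_exists  # FALSE
res$y         # 2
```

Inside the list() call, the = signs are named-argument tokens, so they are unaffected by the override.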