If you run:
mod <- lm(mpg ~ factor(cyl), data=mtcars)
It runs, because lm knows to look in mtcars to find both mpg and cyl.
Yet mean(mpg) fails because it can't find mpg, so you have to write mean(mtcars$mpg) instead.
How do you code a function so that it knows to look in 'data' for the variables?
myfun <- function(a, b, data) {
  return(a + b)
}
This will work with:
myfun(mtcars$mpg, mtcars$hp)
but will fail with:
myfun(mpg,hp, data=mtcars )
Cheers
Answer
Here's how I would code myfun():
myfun <- function(a, b, data) {
  eval(substitute(a + b), envir = data, enclos = parent.frame())
}
myfun(mpg, hp, mtcars)
# [1] 131.0 131.0 115.8 131.4 193.7 123.1 259.3 86.4 117.8 142.2 140.8 196.4
# [13] 197.3 195.2 215.4 225.4 244.7 98.4 82.4 98.9 118.5 165.5 165.2 258.3
# [25] 194.2 93.3 117.0 143.4 279.8 194.7 350.0 130.4
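The enclos = parent.frame() argument matters when one of the symbols is not a column of the data frame: eval() then falls back to the environment myfun() was called from. Here is a minimal sketch of that fallback, where bonus is a hypothetical variable introduced only for illustration:

```r
# myfun() as defined in the answer above
myfun <- function(a, b, data) {
  eval(substitute(a + b), envir = data, enclos = parent.frame())
}

# 'bonus' is a hypothetical variable; it is NOT a column of mtcars
bonus <- 100

# mpg is found in mtcars, bonus is found in the calling environment
with_bonus <- myfun(mpg, bonus, mtcars)
head(with_bonus, 3)
# [1] 121.0 121.0 122.8
```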
If you're familiar with with(), it's interesting to see that it works in almost exactly the same way:
> with.default
# function (data, expr, ...)
# eval(substitute(expr), data, enclos = parent.frame())
#
#
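As a quick check of that similarity, the sketch below compares myfun() (as defined in the answer) with the equivalent with() call. The names via_myfun and via_with are just illustrative:

```r
# myfun() as defined in the answer above
myfun <- function(a, b, data) {
  eval(substitute(a + b), envir = data, enclos = parent.frame())
}

via_myfun <- myfun(mpg, hp, mtcars)   # symbols resolved inside mtcars
via_with  <- with(mtcars, mpg + hp)   # same expression, evaluated by with()

identical(via_myfun, via_with)
# [1] TRUE
```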
In both cases, the key idea is to first create an expression from the symbols passed in as arguments and then evaluate that expression using data as the 'environment' of the evaluation.
The first part (e.g. turning a + b into the expression mpg + hp) is possible thanks to substitute(). The second part is possible because eval() was beautifully designed, such that it can take a data.frame as its evaluation environment.
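To see the two parts separately, here is a small sketch that splits them up. The helper show_expr() is hypothetical and exists only to expose what substitute() produces inside a function:

```r
# Part 1: substitute() replaces the formal arguments a and b with the
# unevaluated symbols the caller supplied (hypothetical helper)
show_expr <- function(a, b) substitute(a + b)

expr <- show_expr(mpg, hp)
expr
# mpg + hp
class(expr)
# [1] "call"

# Part 2: eval() looks up mpg and hp as columns of the data frame
result <- eval(expr, envir = mtcars)
head(result, 3)
# [1] 131.0 131.0 115.8
```

Note that show_expr(mpg, hp) never evaluates mpg or hp, which is why it does not fail even though neither exists outside mtcars.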