Friday 27 December 2019

ggplot2 - Error using sapply and split to have different pvalues and r^2's on a facet wrap ggplot

I'm attempting to have different pvalues and r^2's show up on a plot I'm making using ggplot. My plot needs to be faceted, as my data is split across many combinations of factors. The graphs I'm trying to make should all be linear models, but I'd like each to have its own pvalue and r^2 show up in its respective panel.



I've been trying to use sapply to split the dataframe up and then calculate the r^2's and pvalues and then plug them back into the plot using geom_text(label = examplefunction), but I keep receiving the error "Error: Aesthetics must be either length 1 or the same as the data (244): x, y, label, hjust, vjust".



Here's an example using the "tips" dataframe from the reshape package:



library(ggplot2)
library(reshape)


lm_equation <- function(tips) {
  # One label per sex-by-day group
  sapply(split(tips, list(tips$sex, tips$day)), function(tips) {
    m <- lm(tip ~ total_bill, data = tips)
    eq <- substitute(~~italic(r)^2~"="~rvalue*","~italic(p)~"="~pvalue,
                     list(rvalue = sprintf("%.2f", sign(coef(m)[2]) * sqrt(summary(m)$r.squared)),
                          pvalue = format(summary(m)$coefficients[2, 4], digits = 2)))
    as.character(as.expression(eq))
  })
}


scat <- ggplot(tips, aes(tip, total_bill))
scat +
  geom_point(size = 5, alpha = 0.9) +
  labs(x = "tip", y = "bill total") +
  geom_smooth(method = lm, colour = "#000000", se = FALSE) +
  facet_grid(sex ~ day, scales = "free") +
  geom_text(x = min(tips$tip), y = max(tips$total_bill - 10), label = lm_equation(tips), parse = TRUE, vjust = "inward", hjust = "inward") +
  theme_classic() +
  theme(text = element_text(size = 15))



What's frustrating is the code works if I take out the split, but then the pvalues and r^2s are meaningless since they are taken from the entire dataframe rather than just that specific faceted group.
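The error message looks consistent with a simple length mismatch, which can be checked without plotting anything. This is only a sketch of that check; the variable name groups is introduced here for illustration. The split version produces one label per sex-by-day combination (8 of them), while ggplot has 244 rows of data, so the label vector is neither length 1 nor the length of the data.

```r
library(reshape)  # provides the tips data frame

# Same split used inside lm_equation()
groups <- split(tips, list(tips$sex, tips$day))

length(groups)  # number of sex-by-day groups, i.e. number of labels returned
nrow(tips)      # number of rows ggplot compares the label length against
```

The unsplit version returns a single string, which is why it is accepted.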



Example of working code:



lm_equation2 <- function(tips) {
  # Single model fitted to the whole data frame
  m <- lm(tip ~ total_bill, data = tips)
  eq <- substitute(~~italic(r)^2~"="~rvalue*","~italic(p)~"="~pvalue,
                   list(rvalue = sprintf("%.2f", sign(coef(m)[2]) * sqrt(summary(m)$r.squared)),
                        pvalue = format(summary(m)$coefficients[2, 4], digits = 2)))
  as.character(as.expression(eq))
}

scat2 <- ggplot(tips, aes(tip, total_bill))
scat2 +
  geom_point(size = 5, alpha = 0.9) +
  labs(x = "tip", y = "bill total") +
  geom_smooth(method = lm, colour = "#000000", se = FALSE) +
  facet_grid(sex ~ day, scales = "free") +
  geom_text(x = min(tips$tip), y = max(tips$total_bill - 10), label = lm_equation2(tips), parse = TRUE, vjust = "inward", hjust = "inward") +
  theme_classic() +
  theme(text = element_text(size = 15))


What am I missing here? Do I need to resort to subsetting my data?
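One direction that might avoid manual subsetting in the plot call, sketched here under assumptions rather than offered as a confirmed fix: build a small data frame with one row per facet, carrying the same sex and day columns used by facet_grid(), and hand it to geom_text() through its data argument so each label is routed to its own panel. The names label_for, groups, labels and p are hypothetical and introduced only for this sketch. It reports summary(m)$r.squared directly rather than the signed square root used above, and pins the label to the top-left corner with -Inf/Inf instead of min()/max() of the full data.

```r
library(ggplot2)
library(reshape)  # provides the tips data frame

# Hypothetical helper: build one plotmath label string from one subset of tips.
label_for <- function(d) {
  s <- summary(lm(tip ~ total_bill, data = d))
  eq <- substitute(
    italic(r)^2 ~ "=" ~ r2 * "," ~ italic(p) ~ "=" ~ pvalue,
    list(
      r2     = sprintf("%.2f", s$r.squared),
      pvalue = format(s$coefficients[2, 4], digits = 2)
    )
  )
  as.character(as.expression(eq))
}

# Same grouping as the facets: one subset per sex-by-day combination.
groups <- split(tips, list(tips$sex, tips$day), drop = TRUE)

# One row per facet; the sex and day columns tell facet_grid() where each label goes.
labels <- do.call(rbind, lapply(groups, function(d) {
  data.frame(
    sex = d$sex[1],
    day = d$day[1],
    label = label_for(d),
    stringsAsFactors = FALSE
  )
}))

p <- ggplot(tips, aes(tip, total_bill)) +
  geom_point(size = 5, alpha = 0.9) +
  labs(x = "tip", y = "bill total") +
  geom_smooth(method = lm, colour = "#000000", se = FALSE) +
  facet_grid(sex ~ day, scales = "free") +
  # Labels come from their own data frame, so their length no longer has to match tips.
  geom_text(
    data = labels,
    aes(x = -Inf, y = Inf, label = label),
    parse = TRUE, hjust = -0.1, vjust = 1.5,
    inherit.aes = FALSE
  ) +
  theme_classic() +
  theme(text = element_text(size = 15))

p
```

Because the labels live in their own data frame, the "length 1 or the same as the data" check applies to that data frame (8 rows, 8 labels) rather than to the 244 rows of tips.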
