I want to plot the text of the mean only once for each cluster. With the code below, the label is drawn once for every observation instead.
Code for reproduction:
library(dplyr)
library(ggplot2)

price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€', '€', '€',
                 '€', '€', '€€€€', '€'), 100)
avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0,
               3.0, 5.0, 4.5, 4.0, 3.0,
               3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5), 100)
sub.df <- data.frame(price_l, avg_r)

sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(label = sprintf("%.2f", mean)))
CodePudding user response:
We could use stat_summary(aes(label = ..y..), geom = "text", fun = mean, color = "black", size = 6, fontface = 2):
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  stat_summary(aes(label = ..y..), geom = "text", fun = mean,
               color = "black", size = 6, fontface = 2)
CodePudding user response:
You can set the y value manually inside geom_text():
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(y = 3.5, label = sprintf("%.2f", mean)),
            check_overlap = TRUE, size = 6, fontface = 2)
Or, as r2evans suggests:
sub.df %>%
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(y = mean, label = sprintf("%.2f", mean)),
            check_overlap = TRUE, size = 6, fontface = 2)
CodePudding user response:
For what it's worth, here's a way to do this using stat_summary(). It has two advantages over the previous method: (1) there's no need to summarize beforehand via group_by() and mutate(), and (2) it avoids the overplotting that occurs if you use geom_text().
The answer that uses geom_text() alone gives the right result, but note that it overplots. The reason is that geom_text(), like all other geoms, draws "a thing" on the plot for every observation in the dataset. The dataframe resulting from the pipe (%>%) commands above the ggplot() call has 2500 observations. This means that if you ask geom_text() to create a label at a specific position, it will do so... 2500 times.
To avoid this, you should do one of two things:
1. Create a separate dataframe of aggregated data containing only 3 observations (three pieces of text here) and use geom_text(data = that_new_dataframe...), or
2. Use stat_summary() and have that do all the summarizing for you based on the original dataset, sub.df.
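The first option isn't shown elsewhere in this thread, so here is a minimal sketch of what it could look like. The name label.df is just something I made up for the aggregated dataframe; the data and the plotting calls are taken from the question. Treat it as one possible way to do it rather than the definitive one.

```r
library(dplyr)
library(ggplot2)

# Data from the question: 25 values repeated 100 times, so 2500 rows.
price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€', '€', '€',
                 '€', '€', '€€€€', '€'), 100)
avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0,
               3.0, 5.0, 4.5, 4.0, 3.0,
               3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5), 100)
sub.df <- data.frame(price_l, avg_r)

# label.df (hypothetical name): one row per price level, holding its mean.
label.df <- sub.df %>%
  group_by(price_l) %>%
  summarise(mean = mean(avg_r), .groups = "drop")

# Points are drawn from the full data; labels only from the summary,
# so each label is drawn once per price level.
p <- ggplot(sub.df, aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(
    data = label.df,
    aes(y = mean, label = sprintf("%.2f", mean)),
    size = 6, fontface = 2
  )
print(p)
```

Because geom_text() gets its own data argument here, it ignores the 2500-row dataframe and only sees the rows of label.df, so check_overlap is not needed.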
For the stat_summary() method, you can create a user-defined function that returns a label and a y value (satisfying the aesthetics required for geom_text()) and then apply it to your dataset within stat_summary() via the fun.data= argument:
my_fun <- function(x) {
  # Return the y position and the label text that geom_text() needs
  data.frame(y = mean(x), label = sprintf("%.2f", mean(x)))
}

ggplot(sub.df, aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  stat_summary(
    geom = "text", fun.data = my_fun, size = 8,
    aes(group = price_l)
  )
Note: after posting this I realize it's similar to @TarJae's answer... but kept it here due to the further explanation.






