I am making a Sankey diagram with ggsankey.
Here is my dataset:
library(ggsankey)
library(tidyverse)
df <-
  mtcars %>%
  make_long(cyl, vs, am, gear, carb) %>%
  mutate(color = c(rep("red", 80), rep("blue", 80)))
You can obtain a Sankey diagram like this:
df %>%
  ggplot(aes(x = x,
             next_x = next_x,
             node = node,
             next_node = next_node,
             fill = factor(node),
             label = factor(node))) +
  geom_sankey() +
  geom_sankey(flow.alpha = .6,
              node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 18) +
  labs(x = NULL) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = .5))
Now I want to color the flows between the nodes by the color column of df. Is this possible? If not, do you know of any other way to do it in R?
I tried:
df %>%
  ggplot(aes(x = x,
             next_x = next_x,
             node = node,
             next_node = next_node,
             fill = factor(color),
             label = factor(node))) +
  geom_sankey() +
  geom_sankey(flow.alpha = .6,
              node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 18) +
  labs(x = NULL) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = .5))
But the plot seems totally broken:
Answer:
In the end, ggalluvial seems better suited to my problem.
Here is the data formatting:
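library(ggalluvial)  # provides stat_alluvium() and geom_stratum()
df <-
  mtcars %>%
  select(cyl, vs, am, gear, carb) %>%
  # first half of the rows tagged "red", second half "blue"; one id per car
  mutate(color = c(rep("red", nrow(mtcars)/2), rep("blue", nrow(mtcars)/2)),
         id = seq_len(nrow(mtcars))) %>%
  pivot_longer(cols = !c(color, id),
               names_to = "var",
               values_to = "state")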
And here is the plot with the correct flow colors:
df %>%
  ggplot(aes(x = var,
             stratum = state,
             label = state,
             alluvium = id)) +
  stat_alluvium(aes(fill = color),
                width = 0,
                alpha = 1,
                geom = "flow") +
  geom_stratum(width = 0.2) +
  geom_text(stat = "stratum", size = 5, angle = 90) +
  theme_bw()
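One caveat with the plot above: mapping fill = color treats "red" and "blue" as category labels, so ggplot2 picks its default palette rather than those literal colors. A minimal sketch of one way to get the literal colors, assuming the same data layout as above, is to add scale_fill_manual() with a named vector. The scale and the after_stat(stratum) label are additions for illustration and are not part of the original answer.

```r
library(tidyverse)
library(ggalluvial)

# Same long-format data as in the answer: one row per car and variable
df <- mtcars %>%
  select(cyl, vs, am, gear, carb) %>%
  mutate(color = rep(c("red", "blue"), each = nrow(mtcars) / 2),
         id = seq_len(nrow(mtcars))) %>%
  pivot_longer(cols = !c(color, id),
               names_to = "var",
               values_to = "state")

p <- ggplot(df, aes(x = var, stratum = state, alluvium = id)) +
  stat_alluvium(aes(fill = color), width = 0, alpha = 1, geom = "flow") +
  geom_stratum(width = 0.2) +
  geom_text(aes(label = after_stat(stratum)),
            stat = "stratum", size = 5, angle = 90) +
  # Assumption: the labels in the color column should be drawn as
  # the colors they name, so map each label to itself
  scale_fill_manual(values = c(red = "red", blue = "blue")) +
  theme_bw()

p
```

If the color column already holds valid color names for every row, scale_fill_identity() should be an equivalent shortcut.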



