I'm trying to pivot to a longer format using dplyr::pivot_longer, but can't seem to get it to do what I want. I can manage with reshape::melt, but I'd also like to be able to achieve the same using pivot_longer.
The data I'm trying to reformat is a correlation matrix of the mtcars-dataset:
# Load packages
library(reshape2)
library(dplyr)
# Get the correlation matrix
mydata <- mtcars[, c(1,3,4,5,6,7)]
cormat <- round(cor(mydata),2)
head(cormat)
mpg disp hp drat wt qsec
mpg 1.00 -0.85 -0.78 0.68 -0.87 0.42
disp -0.85 1.00 0.79 -0.71 0.89 -0.43
hp -0.78 0.79 1.00 -0.45 0.66 -0.71
drat 0.68 -0.71 -0.45 1.00 -0.71 0.09
wt -0.87 0.89 0.66 -0.71 1.00 -0.17
qsec 0.42 -0.43 -0.71 0.09 -0.17 1.00
Then, I want only to filter out the upper triangle of the matrix;
#Get upper triangle of the correlation matrix
cormat[upper.tri(cormat)] <- NA #OR upper.tri function
And then reshape it into a long format:
# Reshape into a long format
melted_cormat <-
cormat %>%
melt(na.rm=TRUE)
head(melted_cormat)
Var1 Var2 value value_2
1 mpg mpg 1.00 1
7 mpg disp -0.85 -0.85
8 disp disp 1.00 1
13 mpg hp -0.78 -0.78
14 disp hp 0.79 0.79
15 hp hp 1.00 1
Finally, what the figure that I'm making is:
ggplot(data = melted_cormat, aes(Var2, Var1, fill = value))
geom_tile(color="white")
scale_fill_gradient2(low = "blue", high = "red", mid = "white",
midpoint = 0, limit = c(-1,1),
#space = "Lab",
name="Spearman\nCorrelation")
theme_minimal()
coord_fixed()
geom_text(aes(Var2, Var1, label = value), color = "black", size = 4)
theme(
axis.text.x=element_text(family="Calibri", face="plain", color="black", size=12, angle=0),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
panel.grid.major=element_blank(),
panel.border=element_blank(),
panel.background=element_blank(),
axis.ticks = element_blank(),
legend.justification = c(1, 0),
legend.position = c(0.9, 0.3),
legend.direction = "horizontal")
guides(fill = guide_colorbar(barwidth = 7, barheight = 1,
title.position = "top", title.hjust = 0.5))
I cant seem to figure out a way to use pivot_longer in stead of reshape so that it still correctly makes the figure. The following almost works (thanks to @geoff), the dataset seems to be correct but the figure does not:
melted_cormat <-
cormat %>%
as_tibble() %>%
mutate(Var1 = colnames(cormat)) %>%
pivot_longer(names_to = "Var2", values_to = "value", mpg:qsec, values_drop_na=TRUE)
running the same ggplot code as above gives:
CodePudding user response:
Does this achieve the behavior you need?
cormat |>
as_tibble() |>
mutate(Var1 = rownames(cormat)) |>
pivot_longer(names_to = "Var2", values_to = "val", mpg:qsec)
Output:
# A tibble: 36 x 3
Var1 Var2 val
<chr> <chr> <dbl>
1 mpg mpg 1
2 mpg disp -0.85
3 mpg hp -0.78
4 mpg drat 0.68
5 mpg wt -0.87
6 mpg qsec 0.42
7 disp mpg -0.85
8 disp disp 1
9 disp hp 0.79
10 disp drat -0.71
# ... with 26 more rows
CodePudding user response:
The nearly automated outputs from corrplot::corrplot() might get you what you want with very litle effort. Check out the vingette for more info.
library(tidyverse)
library(corrplot)
mtcars[, c(1,3,4,5,6,7)] %>%
cor(.) %>%
round(2) %>%
corrplot(method = "color",
type = "upper",
addCoef.col = "black",
col = colorRampPalette(c("red", "white", "blue"))(200),
tl.col = 'black')

Created on 2022-01-12 by the reprex package (v2.0.1)
CodePudding user response:
Compare str(melted_cormat) with str(pivoted_cormat). You'll find that the older reshape2::melt() converts strings to factors whereas tidyr::pivot_longer() leaves them strings.
The consequence of this is that with the melted version, ggplot() will order the rows and columns according to the factor levels and thus preserve the original order in cormat, but in the second case where they're just plain strings, they just go in alphabetical order.
To fix this, simply mutate() Var1 and Var2 into factor using as levels the original order of the columns in cormat. This will give you the plot you wanted.
Observe the difference in the last two lines of the example below and also note that the default method of cor is "pearson" so just be careful when you label your legend with the correlation method.
# Load packages
library(tidyverse)
library(reshape2)
# define plotting function
plot_fun <- function(dat) {
ggplot(data = dat, aes(Var2, Var1, fill = value))
geom_tile(color = "white")
scale_fill_gradient2(
low = "blue",
high = "red",
mid = "white",
midpoint = 0,
limit = c(-1, 1),
#space = "Lab",
name = "Spearman\nCorrelation"
)
theme_minimal()
coord_fixed()
geom_text(aes(Var2, Var1, label = value),
color = "black",
size = 4)
theme(
axis.text.x = element_text(
family = "Calibri",
face = "plain",
color = "black",
size = 12,
angle = 0
),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank(),
panel.background = element_blank(),
axis.ticks = element_blank(),
legend.justification = c(1, 0),
legend.position = c(0.9, 0.3),
legend.direction = "horizontal"
)
guides(fill = guide_colorbar(
barwidth = 7,
barheight = 1,
title.position = "top",
title.hjust = 0.5
))
}
# Get the correlation matrix
cormat <- mtcars[, c(1, 3, 4, 5, 6, 7)] %>%
cor(., method = "spearman") %>% # note selection of correlation method
round(2) %>%
replace(upper.tri(.), NA)
# make melted version
melted <- cormat %>%
melt(na.rm = TRUE)
# make pivoted version
pivoted <-
cormat %>%
as.data.frame() %>%
rownames_to_column("Var1") %>%
pivot_longer(
-Var1,
names_to = "Var2",
values_to = "value",
values_drop_na = TRUE
)
# note column types on melted vs pivoted
str(melted)
#> 'data.frame': 21 obs. of 3 variables:
#> $ Var1 : Factor w/ 6 levels "mpg","disp","hp",..: 1 2 3 4 5 6 2 3 4 5 ...
#> $ Var2 : Factor w/ 6 levels "mpg","disp","hp",..: 1 1 1 1 1 1 2 2 2 2 ...
#> $ value: num 1 -0.91 -0.89 0.65 -0.89 0.47 1 0.85 -0.68 0.9 ...
str(pivoted)
#> tibble [21 x 3] (S3: tbl_df/tbl/data.frame)
#> $ Var1 : chr [1:21] "mpg" "disp" "disp" "hp" ...
#> $ Var2 : chr [1:21] "mpg" "mpg" "disp" "mpg" ...
#> $ value: num [1:21] 1 -0.91 1 -0.89 0.85 1 0.65 -0.68 -0.52 1 ...
# melted version gives desired plot
melted %>%
plot_fun()

# pivoted version orders variables in alphabetical order
pivoted %>%
plot_fun()

# turning the variable names into a factor fixes the plot
pivoted %>%
mutate(across(starts_with("Var"), ~factor(.x, levels = colnames(cormat)))) %>%
plot_fun()

Created on 2022-01-12 by the reprex package (v2.0.1)


