I want to change the format of my data frame. It is currently in long format, but I want to convert it to wide format so that each sample has its own column indicating whether the virus is present or absent, based on the information in cond. Present should be coded as 1 and absent as 0.
In:
virus sample cond
1 virusA A Present
2 virusB A Present
3 virusC A Absent
4 virusA B Absent
5 virusB B Present
6 virusC B Present
df <- structure(list(virus = c("virusA", "virusB", "virusC", "virusA",
"virusB", "virusC"), sample = c("A", "A", "A", "B", "B", "B"),
cond = c("Present", "Present", "Absent", "Absent", "Present",
"Present")), class = "data.frame", row.names = c(NA, -6L))
Out:
> df.out
virus A B
1 virusA 1 0
2 virusB 1 1
3 virusC 0 1
CodePudding user response:
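Use pivot_wider() with values_fn. Here values_fn counts the 'Present' entries in each virus/sample cell, which gives 1 or 0 when every virus/sample pair occurs exactly once: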
library(tidyr)
# One column per sample; each cell is the number of 'Present' rows for that virus/sample pair
pivot_wider(df, names_from = sample, values_from = cond,
            values_fn = list(cond = ~ sum(. == 'Present')))
Output:
# A tibble: 3 × 3
virus A B
<chr> <int> <int>
1 virusA 1 0
2 virusB 1 1
3 virusC 0 1
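As a possible alternative, here is a minimal sketch that recodes cond to 0/1 before reshaping, so no summary function is needed. It assumes dplyr is available in addition to tidyr and that each virus/sample pair appears at most once. The values_fill = 0L argument is my own assumption that a virus/sample pair missing from the data should be treated as absent; drop it if such cells should stay NA.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  virus  = c("virusA", "virusB", "virusC", "virusA", "virusB", "virusC"),
  sample = c("A", "A", "A", "B", "B", "B"),
  cond   = c("Present", "Present", "Absent", "Absent", "Present", "Present")
)

df.out <- df %>%
  # Recode the condition: "Present" becomes 1L, anything else 0L
  mutate(cond = as.integer(cond == "Present")) %>%
  # Spread the samples into columns; missing pairs are filled with 0L (assumption)
  pivot_wider(names_from = sample, values_from = cond, values_fill = 0L)

df.out
```

Unlike the sum() approach, this version does not silently aggregate: if a virus/sample pair occurs more than once, pivot_wider() warns and returns list-columns, which may be preferable when duplicates would indicate a data problem.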
