I have a data frame full of survey answers, so every column contains only Never, Sometimes and Always. I need to change Never to the numeric 0, Sometimes to the numeric 1 and Always to the numeric 2. Is there a way to apply this change to the whole data frame at once instead of to individual columns?
CodePudding user response:
Suppose your data frame looks like this:
df
#> Q1 Q2 Q3
#> 1 Never Always Always
#> 2 Always Never Never
#> 3 Never Never Never
#> 4 Sometimes Never Never
#> 5 Never Sometimes Never
#> 6 Always Sometimes Sometimes
#> 7 Always Sometimes Never
#> 8 Sometimes Sometimes Never
#> 9 Sometimes Always Sometimes
#> 10 Always Never Sometimes
Then you can do
df[] <- sapply(df, function(x) match(x, c("Never", "Sometimes", "Always")) - 1)
Which results in
df
#> Q1 Q2 Q3
#> 1 0 2 2
#> 2 2 0 0
#> 3 0 0 0
#> 4 1 0 0
#> 5 0 1 0
#> 6 2 1 1
#> 7 2 1 0
#> 8 1 1 0
#> 9 1 2 1
#> 10 2 0 1
Reproducible data frame
set.seed(1)
df <- replicate(3, sample(c("Never", "Sometimes", "Always"), 10, TRUE))
df <- setNames(as.data.frame(df), c("Q1", "Q2", "Q3"))
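One thing worth checking with the match() approach: any answer that is not in the lookup vector (a typo, different capitalisation, a blank) silently becomes NA. Below is a minimal sketch of how that could be detected. The helper name `to_score` and the small `answers` data frame are made up for illustration, and lapply() is used in place of sapply() so the result is always a list of columns.

```r
# Hypothetical helper: map survey answers to 0/1/2 by their position in `levels`.
to_score <- function(x, levels = c("Never", "Sometimes", "Always")) {
  match(x, levels) - 1
}

# Illustrative data; "never" is deliberately miscapitalised.
answers <- data.frame(
  Q1 = c("Never", "Always", "Sometimes"),
  Q2 = c("Always", "never", "Sometimes"),
  stringsAsFactors = FALSE
)

scored <- answers
scored[] <- lapply(answers, to_score)

# Cells that held a value before but are NA afterwards were not recognised.
unmatched <- is.na(scored) & !is.na(answers)
which(unmatched, arr.ind = TRUE)
```

If the last line reports any cells, the answers in those positions probably need cleaning (for example with trimws() or by fixing capitalisation) before scoring.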
CodePudding user response:
You could convert to factor and then to numeric (using Allan Cameron's sample data):
df[] <- sapply(df, function(x) as.numeric(factor(x, levels = c("Never", "Sometimes", "Always"))) - 1)
library(dplyr)
df %>%
  mutate(total = Q1 + Q2 + Q3)
Q1 Q2 Q3 total
1 0 2 2 4
2 2 0 0 2
3 0 0 0 0
4 1 0 0 1
5 0 1 0 1
6 2 1 1 4
7 2 1 0 3
8 1 1 0 2
9 1 2 1 4
10 2 0 1 3
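If there are many questions, listing every column in the sum gets tedious. As a sketch of one alternative (not part of the answer above), rowSums() can add up all scored columns in base R. The `answers` data frame and the name `lvls` are illustrative assumptions.

```r
# Illustrative data, not the sample data used above.
answers <- data.frame(
  Q1 = c("Never", "Always", "Sometimes"),
  Q2 = c("Always", "Never", "Sometimes"),
  Q3 = c("Always", "Never", "Never"),
  stringsAsFactors = FALSE
)

# The order of the levels defines the scores: Never = 0, Sometimes = 1, Always = 2.
lvls <- c("Never", "Sometimes", "Always")

scored <- answers
scored[] <- lapply(answers, function(x) as.integer(factor(x, levels = lvls)) - 1L)

# rowSums() adds every column, so the questions need not be listed by name.
scored$total <- rowSums(scored)
scored
```

Note that the total has to be computed before any non-question columns are added to `scored`, otherwise they would be included in the sum.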
CodePudding user response:
Another approach is to use a named vector as a lookup table, which is probably more appropriate if you want more flexibility in your translations.
set.seed(1)
df <- replicate(3, sample(c("Never", "Sometimes", "Always"), 10, TRUE))
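df <- setNames(as.data.frame(df, stringsAsFactors = FALSE), c("Q1", "Q2", "Q3"))
# Named lookup vector: the names are the answers, the values are the scores.
# (Called `lookup` rather than `t` to avoid masking base::t.)
lookup <- c(0:2)
names(lookup) <- c("Never", "Sometimes", "Always")
as.data.frame(lapply(df, function(x) lookup[x]))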
# Q1 Q2 Q3
# 1 0 2 2
# 2 2 0 0
# 3 0 0 0
# 4 1 0 0
# 5 0 1 0
# 6 2 1 1
# 7 2 1 0
# 8 1 1 0
# 9 1 2 1
# 10 2 0 1
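To illustrate the flexibility mentioned above, here is a small sketch in which one question is scored in reverse. The `answers` data frame, the vector names `scores` and `reversed`, and the idea that Q2 is a reverse-worded item are all assumptions made for the example.

```r
# Illustrative data, not the sample data used above.
answers <- data.frame(
  Q1 = c("Never", "Always", "Sometimes"),
  Q2 = c("Always", "Never", "Sometimes"),
  stringsAsFactors = FALSE
)

# Two lookup tables: the usual scoring and a reversed one.
scores   <- c(Never = 0, Sometimes = 1, Always = 2)
reversed <- c(Never = 2, Sometimes = 1, Always = 0)

scored <- answers
scored$Q1 <- unname(scores[answers$Q1])
scored$Q2 <- unname(reversed[answers$Q2])  # treat Q2 as a reverse-worded item
scored
```

Indexing by name only works on character columns. If the columns are factors, wrap them in as.character() first, because a factor would index the lookup vector by its integer codes instead.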
CodePudding user response:
Since no one has used a tidyverse approach yet, I'll add one here.
Use across(everything()) to include all columns in the dataframe.
case_when() allows you to manually specify conditions and values.
The sample data is also from Allan Cameron.
library(dplyr)
set.seed(1)
df <- replicate(3, sample(c("Never", "Sometimes", "Always"), 10, TRUE))
df <- setNames(as.data.frame(df, stringsAsFactors = FALSE), c("Q1", "Q2", "Q3"))
df <- as_tibble(df)
df
# A tibble: 10 x 3
Q1 Q2 Q3
<chr> <chr> <chr>
1 Never Never Always
2 Sometimes Never Never
3 Sometimes Always Sometimes
4 Always Sometimes Never
5 Never Always Never
6 Always Sometimes Sometimes
7 Always Always Never
8 Sometimes Always Sometimes
9 Sometimes Sometimes Always
10 Never Always Sometimes
df %>% mutate(across(
everything(),
~ case_when(.x == "Always" ~ 2L,
.x == "Sometimes" ~ 1L,
.x == "Never" ~ 0L)
))
# A tibble: 10 x 3
Q1 Q2 Q3
<int> <int> <int>
1 0 0 2
2 1 0 0
3 1 2 1
4 2 1 0
5 0 2 0
6 2 1 1
7 2 2 0
8 1 2 1
9 1 1 2
10 0 2 1
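Real survey data often has other columns (an ID, a date) that should not be recoded. As a hedged variation on the above, across() can be limited to character columns with where(is.character), and a named lookup vector can stand in for case_when(). This sketch assumes dplyr 1.0.0 or later; the `answers` tibble and the `scores` vector are illustrative.

```r
library(dplyr)

# Illustrative data with a non-answer column that should stay untouched.
answers <- tibble(
  id = 1:3,
  Q1 = c("Never", "Always", "Sometimes"),
  Q2 = c("Always", "Never", "Sometimes")
)

# Named lookup vector: names are the answers, values are the scores.
scores <- c(Never = 0L, Sometimes = 1L, Always = 2L)

scored <- answers %>%
  mutate(across(where(is.character), ~ unname(scores[.x])))
scored
```

As with case_when() without a fallback condition, any answer that is not one of the three expected strings ends up as NA.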
