I guess this is a fairly easy one, but I cannot sort it out on my own.
I have the following list of dfs:
df1 <- data.frame(col1=c(1,2,3), col2= c(4,3,6))
df2 <- data.frame(col1=c(5,2,7), col2= c(1,4,8))
df3 <- data.frame(col1=c(4,9,9), col2= c(7,6,4))
list.of.dfs <- list(df1,df2,df3)
Now I want to apply this if-else command on my list of dfs:
for (i in 1:length(list.of.dfs)) {
  if (list.of.dfs[[i]]$col2 >= 7) {
    list.of.dfs[[i]]$newcol <- "high"
  } else if (list.of.dfs[[i]]$col2 >= 5) {
    list.of.dfs[[i]]$newcol <- "medium"
  } else if (list.of.dfs[[i]]$col2 < 5) {
    list.of.dfs[[i]]$newcol <- "low"
  }
}
The output I desire is a new column in each df of my list, filled row by row with one of the three labels from my if-else statement.
However, it seems like my code only considers the first row in each iteration:
> list.of.dfs[[1]]
col1 col2 newcol
1 1 4 low
2 2 3 low
3 3 6 low
CodePudding user response:
One way is to use the tidyverse. I wrapped a custom function in purrr::map so that it is applied to every data frame in the list, and used case_when to assign the values of newcol.
library(tidyverse)
map(list.of.dfs, function(x)
  x %>%
    rowwise() %>%
    mutate(newcol = case_when(
      col2 >= 7 ~ "high",
      col2 >= 5 ~ "medium",  # reached only when col2 < 7, so non-integers such as 6.5 are covered too
      col2 < 5 ~ "low"
    )))
Output
[[1]]
# A tibble: 3 × 3
# Rowwise:
col1 col2 newcol
<dbl> <dbl> <chr>
1 1 4 low
2 2 3 low
3 3 6 medium
[[2]]
# A tibble: 3 × 3
# Rowwise:
col1 col2 newcol
<dbl> <dbl> <chr>
1 5 1 low
2 2 4 low
3 7 8 high
[[3]]
# A tibble: 3 × 3
# Rowwise:
col1 col2 newcol
<dbl> <dbl> <chr>
1 4 7 high
2 9 6 medium
3 9 4 low
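Since case_when is vectorized, the rowwise step is probably optional here. A minimal sketch of that variant, assuming the list.of.dfs from the question and that the dplyr and purrr packages are installed (the name rated is only for illustration):

```r
library(dplyr)
library(purrr)

# Data from the question
list.of.dfs <- list(
  data.frame(col1 = c(1, 2, 3), col2 = c(4, 3, 6)),
  data.frame(col1 = c(5, 2, 7), col2 = c(1, 4, 8)),
  data.frame(col1 = c(4, 9, 9), col2 = c(7, 6, 4))
)

# case_when() evaluates each condition on the whole column at once,
# and the first matching condition wins, so no rowwise() is needed
rated <- map(list.of.dfs, function(x) {
  mutate(x, newcol = case_when(
    col2 >= 7 ~ "high",
    col2 >= 5 ~ "medium",
    TRUE      ~ "low"      # everything else, i.e. col2 < 5
  ))
})

rated[[1]]
```

The newcol values should match the output above; the elements of rated are plain data frames rather than rowwise tibbles.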
CodePudding user response:
By following your approach with Base R,
for (i in seq_along(list.of.dfs)) {
  x <- list.of.dfs[[i]]
  # ifelse() is vectorized, so every row of col2 is tested
  list.of.dfs[[i]][, "newcol"] <- ifelse(x[, "col2"] >= 7, "high",
                                         ifelse(x[, "col2"] >= 5, "medium", "low"))
}
gives,
[[1]]
col1 col2 newcol
1 1 4 low
2 2 3 low
3 3 6 medium
[[2]]
col1 col2 newcol
1 5 1 low
2 2 4 low
3 7 8 high
[[3]]
col1 col2 newcol
1 4 7 high
2 9 6 medium
3 9 4 low
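The reason the loop in the question misbehaves is that if() expects a single TRUE or FALSE, whereas list.of.dfs[[i]]$col2 >= 7 yields one value per row. Older R versions used only the first element (with a warning); R 4.2.0 and later stop with an error. ifelse() works element by element instead. A small sketch with the col2 values of df1 (the names at.least.5 and labels are only for illustration):

```r
# col2 values of df1 from the question
col2 <- c(4, 3, 6)

# The comparison returns one logical value per row, not a single one
at.least.5 <- col2 >= 5
length(at.least.5)

# ifelse() is vectorized: it chooses between "yes" and "no" for each element
labels <- ifelse(col2 >= 7, "high",
                 ifelse(col2 >= 5, "medium", "low"))
labels
```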
CodePudding user response:
You can use two functions: one that rates the col2 value of a single row, and a wrapper that applies the first function to every row of a data frame. lapply then runs the wrapper over the list of data frames.
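# Rate a single row; apply() passes each row as a named vector
rating <- function(r) {
  if (r["col2"] >= 7) {
    return("high")
  } else if (r["col2"] < 5) {
    return("low")
  } else {
    return("medium")
  }
}
# Add the ratings of all rows as a new column
rate.df <- function(df) {
  newcol <- apply(df, 1, rating)
  cbind(df, newcol = newcol)
}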
list.of.dfs <- lapply(list.of.dfs, rate.df)
This produces the following output:
[[1]]
col1 col2 newcol
1 1 4 low
2 2 3 low
3 3 6 medium
[[2]]
col1 col2 newcol
1 5 1 low
2 2 4 low
3 7 8 high
[[3]]
col1 col2 newcol
1 4 7 high
2 9 6 medium
3 9 4 low
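One caveat: apply() converts the data frame to a matrix first, so the row-by-row approach probably only behaves as intended while all columns are numeric. A vectorized alternative that avoids calling a function once per row is cut(). A minimal sketch, assuming the same thresholds as above (rate.df.cut and rated are hypothetical names, not part of the answer):

```r
# Data from the question
list.of.dfs <- list(
  data.frame(col1 = c(1, 2, 3), col2 = c(4, 3, 6)),
  data.frame(col1 = c(5, 2, 7), col2 = c(1, 4, 8)),
  data.frame(col1 = c(4, 9, 9), col2 = c(7, 6, 4))
)

# Hypothetical helper: bins col2 into [-Inf, 5), [5, 7) and [7, Inf)
rate.df.cut <- function(df) {
  df$newcol <- as.character(cut(
    df$col2,
    breaks = c(-Inf, 5, 7, Inf),
    labels = c("low", "medium", "high"),
    right  = FALSE  # intervals are closed on the left: 5 is "medium", 7 is "high"
  ))
  df
}

rated <- lapply(list.of.dfs, rate.df.cut)
rated[[3]]
```

cut() returns a factor, so as.character() is used here to get the same character column as the other answers; dropping it would keep newcol as a factor with ordered levels low, medium, high.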
