Apply if else function over list of dfs

Time:01-15

I guess this is a fairly easy one, but I cannot sort it out on my own.

I have the following list of dfs:

df1 <- data.frame(col1=c(1,2,3), col2= c(4,3,6))
df2 <- data.frame(col1=c(5,2,7), col2= c(1,4,8))
df3 <- data.frame(col1=c(4,9,9), col2= c(7,6,4))
list.of.dfs <- list(df1,df2,df3)

Now I want to apply this if-else command on my list of dfs:

for (i in 1:length(list.of.dfs)) {
  if (list.of.dfs[[i]]$col2 >= 7) {
    list.of.dfs[[i]]$newcol <- "high"
  } else if (list.of.dfs[[i]]$col2 >= 5) {
    list.of.dfs[[i]]$newcol <- "medium"
  } else if (list.of.dfs[[i]]$col2 < 5) {
    list.of.dfs[[i]]$newcol <- "low"
  }
}

The output I want is a new column in each df of the list, with every row filled with one of the three labels from my if-else chain.

However, it seems like my code only considers the first row in each iteration:

> list.of.dfs[[1]]
  col1 col2 newcol
1    1    4    low
2    2    3    low
3    3    6    low
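
A note on why this happens, as far as I can tell: `if` expects a single TRUE/FALSE, but `list.of.dfs[[i]]$col2 >= 7` produces one value per row. Older R versions used only the first element (with a warning), and since R 4.2.0 a condition of length greater than one is an error. Below is a minimal sketch of the difference using only df1 from the question; the names `cond`, `first_only` and `per_row` are made up for illustration.

```r
df1 <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 3, 6))

# The comparison is evaluated for every row, so it has length 3.
cond <- df1$col2 >= 5
cond

# `if` can only act on a single value; looking at cond[1] alone mimics
# what the loop in the question effectively did for the whole data frame.
first_only <- if (cond[1]) "medium" else "low"
first_only

# ifelse() is vectorized and returns one label per row.
per_row <- ifelse(df1$col2 >= 5, "medium", "low")
per_row
```

So the fix is to replace the scalar `if` with something vectorized, which is what the answers below do in different ways.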

CodePudding user response:

One way is to use the tidyverse. I wrapped a custom function in purrr::map so that it is applied to every data frame in the list, and used case_when to assign the values in newcol.

library(tidyverse)

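map(list.of.dfs, function(x)
  x %>%
    rowwise() %>%
    mutate(newcol = case_when(
      col2 >= 7 ~ "high",
      # conditions are checked in order, so this covers 5 <= col2 < 7
      # (including non-integer values such as 6.5)
      col2 >= 5 ~ "medium",
      col2 < 5 ~ "low"
    )))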

Output

[[1]]
# A tibble: 3 × 3
# Rowwise: 
   col1  col2 newcol
  <dbl> <dbl> <chr> 
1     1     4 low   
2     2     3 low   
3     3     6 medium

[[2]]
# A tibble: 3 × 3
# Rowwise: 
   col1  col2 newcol
  <dbl> <dbl> <chr> 
1     5     1 low   
2     2     4 low   
3     7     8 high  

[[3]]
# A tibble: 3 × 3
# Rowwise: 
   col1  col2 newcol
  <dbl> <dbl> <chr> 
1     4     7 high  
2     9     6 medium
3     9     4 low   

CodePudding user response:

By following your approach with Base R,

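# seq_along() is safe even if the list is empty, unlike 1:length()
for (i in seq_along(list.of.dfs)) {
    x <- list.of.dfs[[i]]
    # ifelse() is vectorized, so every row gets its own label
    list.of.dfs[[i]][, "newcol"] <- ifelse(x[, "col2"] >= 7, "high",
        ifelse(x[, "col2"] >= 5, "medium", "low")
    )
}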

gives,

[[1]]
  col1 col2 newcol
1    1    4    low
2    2    3    low
3    3    6 medium

[[2]]
  col1 col2 newcol
1    5    1    low
2    2    4    low
3    7    8   high

[[3]]
  col1 col2 newcol
1    4    7   high
2    9    6 medium
3    9    4    low
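
If the nested ifelse() calls get hard to read with more categories, base R's cut() might be worth a look. The following is only a sketch under the same thresholds as the question; `add_newcol` and `result` are hypothetical names, not something from the original post.

```r
df1 <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 3, 6))
df2 <- data.frame(col1 = c(5, 2, 7), col2 = c(1, 4, 8))
df3 <- data.frame(col1 = c(4, 9, 9), col2 = c(7, 6, 4))
list.of.dfs <- list(df1, df2, df3)

# Hypothetical helper: label col2 by binning it with cut().
# right = FALSE makes the bins [-Inf, 5), [5, 7) and [7, Inf),
# which matches "< 5", ">= 5" and ">= 7" from the question.
add_newcol <- function(df) {
  df$newcol <- as.character(cut(df$col2,
                                breaks = c(-Inf, 5, 7, Inf),
                                labels = c("low", "medium", "high"),
                                right = FALSE))
  df
}

# lapply() returns a new list, so no explicit loop index is needed.
result <- lapply(list.of.dfs, add_newcol)
result[[3]]
```

Adding a fourth category then only means adding one break and one label.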

CodePudding user response:

You can use two functions: one that checks the col2 value of a single row, and a wrapper that applies the first function to every row of a data frame. lapply then runs the wrapper over the list of data frames.

rating <- function(r){
  if (r[2] >= 7){
    return("high")
  } else if (r[2] < 5){
    return("low")
  } else {
    return("medium")
  }
}

rate.df <- function(df){
  newcol <- apply(df, 1, rating)
  cbind(df, newcol=newcol)
}

list.of.dfs <- lapply(list.of.dfs, rate.df)

This produces the following output:

[[1]]
  col1 col2 newcol
1    1    4    low
2    2    3    low
3    3    6 medium

[[2]]
  col1 col2 newcol
1    5    1    low
2    2    4    low
3    7    8   high

[[3]]
  col1 col2 newcol
1    4    7   high
2    9    6 medium
3    9    4    low
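
One caveat with this approach: apply() converts the data frame to a matrix first, so `r[2]` relies on col2 being the second column and on all columns being numeric. A possible variant, sketched here with the hypothetical names `rating.value` and `rate.df.by.name`, passes only the col2 values to the scalar function:

```r
# Hypothetical variant of rating(): takes one col2 value, not a whole row.
rating.value <- function(value) {
  if (value >= 7) {
    "high"
  } else if (value >= 5) {
    "medium"
  } else {
    "low"
  }
}

# Hypothetical wrapper: vapply() calls rating.value() once per element of
# col2 and checks that each result is a single character string.
rate.df.by.name <- function(df) {
  df$newcol <- vapply(df$col2, rating.value, character(1))
  df
}

df3 <- data.frame(col1 = c(4, 9, 9), col2 = c(7, 6, 4))
rated <- rate.df.by.name(df3)
rated
```

It would be used the same way as above, e.g. `lapply(list.of.dfs, rate.df.by.name)`.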