Summing data frames with non-numeric values

Time:01-27

I would like to "sum" data frames that contain non-numeric values. When I simply add them with the + operator, I get the following error: "Error in FUN(left, right) : non-numeric argument to binary operator".

How can I solve this?

ex1:

  `04:00` `04:10` `04:20`
  <chr>   <chr>   <chr>  
1 a       0       a      
2 0       a       a      
3 0       0       a      

ex2:

  `04:00` `04:10` `04:20`
  <chr>   <chr>     <dbl>
1 0       b             0
2 b       0             0
3 b       b             0

Desired outcome:

  `04:00` `04:10` `04:20`
  <chr>   <chr>   <chr>  
1 a       b       a      
2 b       a       a      
3 b       b       a      

Sample code:

sum = ex1 + ex2

Sample data:

ex1<-structure(list(`04:00` = c("a", "0", "0"), `04:10` = c("0", "a", 
"0"), `04:20` = c("a", "a", "a")), spec = structure(list(cols = list(
    `04:00` = structure(list(), class = c("collector_character", 
    "collector")), `04:10` = structure(list(), class = c("collector_character", 
    "collector")), `04:20` = structure(list(), class = c("collector_character", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), delim = ","), class = "col_spec"),  row.names = c(NA, 
-3L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"))
    

ex2<-structure(list(`04:00` = c("0", "b", "b"), `04:10` = c("b", "0", 
"b"), `04:20` = c(0, 0, 0)), spec = structure(list(cols = list(
    `04:00` = structure(list(), class = c("collector_character", 
    "collector")), `04:10` = structure(list(), class = c("collector_character", 
    "collector")), `04:20` = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), delim = ","), class = "col_spec"),  row.names = c(NA, 
-3L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"))

CodePudding user response:

For the test case this works:

data.frame(
  Map(\(x, y) ifelse(x == "0", y, x), 
      ex1, ex2), 
  check.names = FALSE)
#  04:00 04:10 04:20
#1     a     b     a
#2     b     a     a
#3     b     b     a

Note that this doesn't check whether the column names match. It simply pairs up the columns of the two data.frames by position.
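If the column order could differ between the two data frames, one option is to reorder ex2 by ex1's column names before pairing them up. The following is only a minimal sketch: it assumes R 4.0 or later and that both data frames have the same set of column names. merge_nonzero is a made-up helper name, and the data are simplified copies of the question's example, with ex2's columns deliberately shuffled.

```r
# Simplified copies of the example data (plain data.frames instead of tibbles)
ex1 <- data.frame(`04:00` = c("a", "0", "0"),
                  `04:10` = c("0", "a", "0"),
                  `04:20` = c("a", "a", "a"),
                  check.names = FALSE)
# Same columns as the question's ex2, deliberately listed in a different order
ex2 <- data.frame(`04:20` = c(0, 0, 0),
                  `04:00` = c("0", "b", "b"),
                  `04:10` = c("b", "0", "b"),
                  check.names = FALSE)

# merge_nonzero is a hypothetical helper name, not part of any package
merge_nonzero <- function(a, b) {
  # Fail early if the two inputs do not share the same set of column names
  stopifnot(setequal(names(a), names(b)))
  b <- b[names(a)]  # reorder b's columns to match a
  data.frame(
    # Keep a's value unless it is "0"; otherwise take b's value as character
    Map(function(x, y) ifelse(x == "0", as.character(y), x), a, b),
    check.names = FALSE
  )
}

res <- merge_nonzero(ex1, ex2)
```

Here res should hold the same values as the desired outcome in the question, even though ex2's columns were given in a different order.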

CodePudding user response:

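You could first set the zeros to NA, and then apply dplyr::coalesce column by column. (Since dplyr 1.1.0, coalesce() called directly on data frames only fills rows that are entirely missing, so it is mapped over the columns here.)

library(dplyr)
library(purrr)

# Convert every column to character and turn "0" into NA
l <- map(list(ex1, ex2), ~ mutate(.x, across(everything(), ~ na_if(as.character(.), "0"))))
# For each pair of columns, keep ex1's value where present, otherwise ex2's
as_tibble(map2(l[[1]], l[[2]], coalesce))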

# A tibble: 3 x 3
  `04:00` `04:10` `04:20`
1 a       b       a      
2 b       a       a      
3 b       b       a      
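The same "zeros to NA, then fill the gaps" idea can also be sketched in base R, in case dplyr is not available. This is an illustration under assumptions rather than a drop-in replacement: zero_to_na is a hypothetical helper name, the data are simplified data.frame copies of the question's example, and columns are matched by name.

```r
# Simplified copies of the example data (plain data.frames instead of tibbles)
ex1 <- data.frame(`04:00` = c("a", "0", "0"),
                  `04:10` = c("0", "a", "0"),
                  `04:20` = c("a", "a", "a"),
                  check.names = FALSE)
ex2 <- data.frame(`04:00` = c("0", "b", "b"),
                  `04:10` = c("b", "0", "b"),
                  `04:20` = c(0, 0, 0),
                  check.names = FALSE)

# zero_to_na is a hypothetical helper: coerce each column to character
# and replace the placeholder "0" with NA
zero_to_na <- function(df) {
  df[] <- lapply(df, function(col) {
    col <- as.character(col)
    col[col == "0"] <- NA_character_
    col
  })
  df
}

a <- zero_to_na(ex1)
b <- zero_to_na(ex2)

# Fill the gaps in a with the values from the same-named column of b
res <- a
for (nm in names(a)) {
  miss <- is.na(a[[nm]])
  res[[nm]][miss] <- b[[nm]][miss]
}
```

Positions that are "0" in both inputs stay NA in res, which makes them easy to spot afterwards.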