My problem is fairly simple, but I can't find the right solution. I have a data frame like this:
ID name var1 var2 var3
1 a 1 -1 2
2 b 2 3 2
3 c 1 -1 -1
For each row, I need to sum the values in var1 to var3 that are greater than zero and store the result in a var_total variable, like this:
ID name var1 var2 var3 var_total
1 a 1 -1 2 3
2 b 2 3 2 7
3 c 1 -1 -1 1
I managed to get the unconditional sum, like this:
df %>% rowwise %>% mutate(var_total = sum(c_across(starts_with('var'))))
I know there's the na.rm option, so I thought I could temporarily turn the negative values into NAs, but I'm not sure whether that's the right approach or whether there's an easy way to get the original numbers back.
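To make the NA idea concrete, here is a minimal sketch that applies it to a temporary copy of the columns, so the original values never need to be restored. The data frame is rebuilt from the example above (numeric column types are an assumption), and `vars` is a hypothetical helper name.

```r
# Example data rebuilt from the table above (column types assumed)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# Work on a copy of the var* columns; df itself is not modified here
vars <- df[startsWith(names(df), "var")]
vars[vars < 0] <- NA

# na.rm = TRUE skips the values that were negative
df$var_total <- rowSums(vars, na.rm = TRUE)
```

Because only `vars` is altered, var1 to var3 in `df` keep their negative values.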
Thanks!
CodePudding user response:
Using c_across and rowwise -
library(dplyr)
df %>%
  rowwise() %>%
  mutate(var_total = {
    x <- c_across(starts_with('var'))
    sum(x[x > 0])
  })
But a vectorised base R option would be -
cols <- grep('^var[0-9]+$', names(df))  # var1..var3 only, so a rerun does not pick up var_total
df$var_total <- rowSums(df[cols] * (df[cols] > 0))
df
# ID name var1 var2 var3 var_total
#1 1 a 1 -1 2 3
#2 2 b 2 3 2 7
#3 3 c 1 -1 -1 1
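The multiplication in the base R version works because a logical value is coerced to 1 (TRUE) or 0 (FALSE) in arithmetic, so non-positive entries are zeroed out before summing. A small sketch on a single row's values (the vector `x` is only an illustration):

```r
x <- c(1, -1, 2)         # the var1..var3 values of the first row

mask <- x > 0            # TRUE FALSE TRUE
masked <- x * mask       # 1 0 2: the negative entry becomes 0
total <- sum(masked)     # 3
```

One caveat: an NA in the data stays NA under this multiplication, so rowSums would need na.rm = TRUE in that case.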
CodePudding user response:
Here is a base R one-liner:
rowSums(replace(df, df < 0, 0)[-c(1, 2)])
#[1] 3 7 1
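Note that the one-liner compares every column with 0, including the character column name. A slightly longer sketch that subsets the numeric columns first avoids that comparison; the data frame is rebuilt from the question (column types assumed) and `num` and `totals` are hypothetical names.

```r
# Example data rebuilt from the question (column types assumed)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

num <- df[-c(1, 2)]                          # drop ID and name, keep var1..var3
totals <- rowSums(replace(num, num < 0, 0))  # negatives set to 0, then row sums
```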
