My code is below:
library(dplyr)
data <- data.frame(x1 = c(1, 1, 1, 1),
                   x2 = c(0, 1, 0, 1),
                   x3 = c(1, 1, 0, 1),
                   x4 = c(1, 1, 0, 0)) %>% rowSums()
data %>%
  case_when(. == 0 ~ 0,
            . %in% c(1, 2) ~ 1,
            . %in% c(3:5) ~ 2)
The sample data are shown below:
| x1 | x2 | x3 | x4 |
|---|---|---|---|
| 1 | 0 | 1 | 1 |
| 1 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 0 |
Here x1, x2, x3, and x4 are binary variables in a single data frame.
The row sums of x1, x2, x3, and x4 are then calculated.
The result is shown below:
| rowsums |
|---|
| 3 |
| 4 |
| 1 |
| 3 |
I would like to use case_when to classify these row sums. However, when I run the code above, I get this error:
! Case 1 (.) must be a two-sided formula, not a double vector.
I have not been able to solve it by trying other approaches.
CodePudding user response:
The pipe inserts the left-hand expression as the first argument into the right-hand side call.
That is, your call is equivalent to:
case_when(data,
          data == 0 ~ 0,
          data %in% c(1, 2) ~ 1,
          data %in% c(3:5) ~ 2)
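Because the first argument is then the numeric vector itself rather than a two-sided formula, case_when reports the error you see. To prevent this, surround the right-hand side with braces {…}: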
data %>% {
  case_when(. == 0 ~ 0,
            . %in% c(1, 2) ~ 1,
            . %in% c(3:5) ~ 2)
}
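For reference, here is a minimal, self-contained sketch of the brace fix alongside two alternatives that avoid the problem altogether. It assumes dplyr is installed; the names df, sums, piped, direct, classified, total, and group are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Sample data from the question (the variable names here are illustrative).
df <- data.frame(x1 = c(1, 1, 1, 1),
                 x2 = c(0, 1, 0, 1),
                 x3 = c(1, 1, 0, 1),
                 x4 = c(1, 1, 0, 0))
sums <- rowSums(df)

# Option 1: braces stop %>% from inserting `sums` as the first argument.
piped <- sums %>% {
  case_when(. == 0 ~ 0,
            . %in% c(1, 2) ~ 1,
            . %in% 3:5 ~ 2)
}

# Option 2: skip the pipe and refer to the vector by name.
direct <- case_when(sums == 0 ~ 0,
                    sums %in% c(1, 2) ~ 1,
                    sums %in% 3:5 ~ 2)

# Option 3: keep everything in the data frame and classify inside mutate().
classified <- df %>%
  mutate(total = x1 + x2 + x3 + x4,
         group = case_when(total == 0 ~ 0,
                           total %in% c(1, 2) ~ 1,
                           total %in% 3:5 ~ 2))
```

For the sample row sums 3, 4, 1, 3, all three options give the classes 2, 2, 1, 2. Option 3 may be the most convenient if the classification should stay next to the original columns.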
The documentation gives the following description:
For example, iris %>% subset(1:nrow(.) %% 2 == 0) is equivalent to iris %>% subset(., 1:nrow(.) %% 2 == 0) but slightly more compact. It is possible to overrule this behavior by enclosing the rhs in braces. For example, 1:10 %>% {c(min(.), max(.))} is equivalent to c(min(1:10), max(1:10)).
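The difference the braces make can be checked directly with the documentation's own example. This is a small sketch assuming magrittr is installed; the names with_braces and without_braces are hypothetical.

```r
library(magrittr)

# With braces: the left-hand side is NOT inserted as the first argument,
# so this is c(min(1:10), max(1:10)).
with_braces <- 1:10 %>% {c(min(.), max(.))}

# Without braces: the dot only appears in nested calls, so the left-hand side
# is also inserted first, i.e. c(1:10, min(1:10), max(1:10)).
without_braces <- 1:10 %>% c(min(.), max(.))
```

Here with_braces has two elements (1 and 10), while without_braces has twelve: the original ten values followed by 1 and 10.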
