I have a dataset with different types of observations across several "transects". I'm still fairly new to R and am struggling with the issue below.
I need to count the "nest" observations in each transect, but I get an error that makes me think I'm not using the correct function. In the end, I want a new column called "nest_number" that holds, for every row, the number of observations equal to "nest" in that row's transect.
The data is in this format:
| transect | observation |
|---|---|
| 1A | nest |
| 1A | NA |
| 1A | nest |
| 1A | vocalization |
| 1A | NA |
| 2A | nest |
| 2A | NA |
| ... | ... |
Here is how I need the output to look:
| transect | observation | nest_number |
|---|---|---|
| 1A | nest | 2 |
| 1A | NA | 2 |
| 1A | nest | 2 |
| 1A | vocalization | 2 |
| 1A | NA | 2 |
| 2A | nest | 1 |
| 2A | NA | 1 |
| ... | ... | ... |
Here is the code I used:
dfNew <- df %>%
  group_by(transect) %>%
  mutate(number_nests = colSums(observation == "nest", na.rm = TRUE))
The error I get is:
'x' must be an array of at least two dimensions
The error occurred in group 1: transect = "1A".
Answer:
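Use sum rather than colSums. colSums expects a data.frame or matrix (something with at least two dimensions), but observation == "nest" is a plain logical vector, which is what the error message is complaining about. sum counts the TRUE values in that vector, and na.rm = TRUE drops the NA results that come from comparing a missing observation with "nest".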
library(dplyr)
df %>%
  group_by(transect) %>%
  mutate(nest_number = sum(observation == "nest", na.rm = TRUE)) %>%
  ungroup()
Output:
# A tibble: 7 × 3
  transect observation  nest_number
  <chr>    <chr>              <int>
1 1A       nest                   2
2 1A       <NA>                   2
3 1A       nest                   2
4 1A       vocalization           2
5 1A       <NA>                   2
6 2A       nest                   1
7 2A       <NA>                   1
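For anyone who wants to reproduce this, here is a minimal sketch. The data frame is rebuilt by hand from the seven rows shown in the question's table (the elided "..." rows are not guessed), and the ave() call is a base R alternative that I am adding for comparison, not the method used in the answer above. The helper name is_nest is just an illustrative variable.

```r
# Toy data: only the seven rows visible in the question's example table.
df <- data.frame(
  transect    = c("1A", "1A", "1A", "1A", "1A", "2A", "2A"),
  observation = c("nest", NA, "nest", "vocalization", NA, "nest", NA),
  stringsAsFactors = FALSE
)

# The comparison returns a one-dimensional logical vector; NA stays NA.
is_nest <- df$observation == "nest"

# A plain vector has no dim attribute, which is why colSums() rejects it.
is.null(dim(is_nest))

# sum() counts the TRUE values; na.rm = TRUE skips the NA comparisons.
sum(is_nest, na.rm = TRUE)

# Base R alternative (assumption: you want to avoid the dplyr dependency).
# ave() applies FUN within each transect and repeats the result on every
# row of that transect, like group_by() + mutate() does.
df$nest_number <- ave(
  as.integer(is_nest),
  df$transect,
  FUN = function(x) sum(x, na.rm = TRUE)
)

df
```

On this toy data the nest_number column should match the tibble above: 2 for every 1A row and 1 for every 2A row.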
