I have the following occurrence data

I am trying to count the number of 1's for each year. I then want to do the same for the 0's, calculate the proportion of 1's for each year, log-transform it, and plot the transformed proportion against year.
I have tried using a for loop to sum the number of 1's (which I called present):
present = matrix(numeric(length = length(Rabbit_10$Occurrence)))
for (i in 1:nrow(Rabbit_10$Year)) {
if (Rabbit_10[Rabbit_10$Occurrence == 1]){
present = sum(Rabbit_10[Rabbit_10$Occurrence == 1])
}
}
present
but I got the error message:
Error in 1:nrow(Rabbit_10$Year) : argument of length 0.
I have tried to Google possible solutions but nothing worked.
CodePudding user response:
Creating a dataset with binary presence/absence:
library(dplyr)   # pipes, group_by(), summarise(), group_split()
library(ggplot2) # plotting
df1 <- data.frame(
  occurence = sample(0:1, 100, replace = TRUE),
  year = sample(1770:1780, 100, replace = TRUE))
Splitting the data by year:
df1_list <- df1 %>%
group_by(year, occurence) %>% # grouping by occurence and year
summarise(freq_group = n()) %>% # creating a frequency for 0 and 1 for each year
ungroup() %>% # ungrouping
group_split(year) # splitting dataset by year into a list
Applying the calculation to each year's data frame with lapply:
df1_list <- lapply(df1_list, function(x) {
x %>%
add_count() %>% # adding count of how many rows exist per group
mutate(prop = case_when(
n == 2 & occurence == 1 ~ log(freq_group / sum(freq_group)), # in cases where n is 2 (indicating two rows) and occurence is 1
n == 1 & occurence == 1 ~ log(freq_group / sum(freq_group)), # in cases where n is 1 and occurence is 1
n == 1 & occurence == 0 ~ log(0) # in cases where n is 1 and occurence is 0 it returns log(0) -inf
))})
Plotting:
plot1 <- bind_rows(df1_list) %>% # binding data
  filter(!is.na(prop)) %>% # filtering out rows where prop is NA
  ggplot(data = .) + # plotting
  geom_point(aes(x = year, y = prop))
plot1 # print the plot
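The per-year counts, proportion and log-transform described in the question can probably also be computed in a single summarise() call, without splitting into a list. This is only a minimal sketch, assuming the same simulated df1 columns as above; set.seed() and the names year_summary, present, absent, prop and log_prop are my own additions for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
df1 <- data.frame(
  occurence = sample(0:1, 100, replace = TRUE),
  year = sample(1770:1780, 100, replace = TRUE))

year_summary <- df1 %>%
  group_by(year) %>%
  summarise(
    present  = sum(occurence == 1),           # number of 1's in the year
    absent   = sum(occurence == 0),           # number of 0's in the year
    prop     = present / (present + absent),  # proportion of 1's
    log_prop = log(prop))                     # -Inf if a year has no 1's

year_summary
```

A year with no 1's ends up with log_prop equal to -Inf, so those rows may need filtering before plotting.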
CodePudding user response:
Using the same data as @Dasr, you can also do this all in one go with ggplot:
library(ggplot2)
df1 <- data.frame(
  occurence = sample(0:1, 100, replace = TRUE),
  year = sample(1770:1780, 100, replace = TRUE))
ggplot(df1, aes(x = year, y = occurence)) +
  stat_summary(aes(group = year),
               geom = "point",
               fun = function(x) log(mean(x, na.rm = TRUE)))
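As for the error in the question: nrow() returns NULL when it is given a single column such as Rabbit_10$Year, because a column is a plain vector with no dimensions, so 1:nrow(...) fails. A loop is not really needed here anyway. The following is a rough base R sketch using tapply(), assuming the simulated df1 from above rather than the real Rabbit_10 data; set.seed() and the names present, absent and log_prop are my own choices.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
df1 <- data.frame(
  occurence = sample(0:1, 100, replace = TRUE),
  year = sample(1770:1780, 100, replace = TRUE))

# A single column is a vector, so nrow() gives NULL;
# use length(df1$year) or nrow(df1) instead.
is.null(nrow(df1$year))

# Count the 1's and 0's per year, then log-transform the proportion of 1's
present  <- tapply(df1$occurence == 1, df1$year, sum)
absent   <- tapply(df1$occurence == 0, df1$year, sum)
log_prop <- log(present / (present + absent))

# The names of log_prop are the years (as character), so convert them back
plot(as.numeric(names(log_prop)), log_prop,
     xlab = "year", ylab = "log proportion of 1's")
```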
