I use dplyr for most of my data wrangling in R, but I am having a hard time achieving this particular effect, and I can't seem to find the answer by googling either.
Assume I have data like this. I want to sort the person-grouped data by each person's cash value from the year 2021; the outcome I am after is shown further below. If I only had the 2021 values I could simply use ... %>% arrange(desc(cash)), but I am not sure how to proceed from here.
year person cash
0 2020 personone 29
1 2021 personone 40
2 2020 persontwo 17
3 2021 persontwo 13
4 2020 personthree 62
5 2021 personthree 55
I want to sort this data in descending order of each person's 2021 cash value, keeping each person's rows together, so that the result looks like this:
year person cash
0 2020 personthree 62
1 2021 personthree 55
2 2020 personone 29
3 2021 personone 40
4 2020 persontwo 17
5 2021 persontwo 13
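For reproducibility, the sample above can be built as follows. This is only a sketch: the name dat is my own choice (the answers below call the data df or dat), and I am assuming integer columns.

```r
library(dplyr)

# Sample data from the question; `dat` is an illustrative name.
dat <- tibble(
  year   = rep(c(2020L, 2021L), times = 3),
  person = rep(c("personone", "persontwo", "personthree"), each = 2),
  cash   = c(29L, 40L, 17L, 13L, 62L, 55L)
)
```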
CodePudding user response:
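One approach using a join: order the people by their 2021 cash, then join the full data back on so that each person's rows follow in that order.
library(dplyr)
df %>%
  filter(year == 2021) %>%        # one row per person: the 2021 value
  # alternative to filter(): group_by(person) %>% slice(2) %>% ungroup() %>%   # each person's 2nd row
  arrange(desc(cash)) %>%         # people ordered by 2021 cash, largest first
  select(-cash, -year) %>%        # keep only the ordered key column
  left_join(df, by = "person")    # bring all rows back, in that person order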
Output:
person year cash
1 personthree 2020 62
2 personthree 2021 55
3 personone 2020 29
4 personone 2021 40
5 persontwo 2020 17
6 persontwo 2021 13
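The same idea can also be written without a join. A minimal sketch, assuming the data frame is called df as above and has exactly one 2021 row per person; ord and result are names introduced here for illustration only:

```r
library(dplyr)

# Sample data from the question.
df <- tibble(
  year   = rep(c(2020L, 2021L), times = 3),
  person = rep(c("personone", "persontwo", "personthree"), each = 2),
  cash   = c(29L, 40L, 17L, 13L, 62L, 55L)
)

# People ordered by their 2021 cash, largest first.
ord <- df %>%
  filter(year == 2021) %>%
  arrange(desc(cash)) %>%
  pull(person)

# match() maps each person to their position in `ord`;
# year keeps the rows of one person in chronological order.
result <- df %>%
  arrange(match(person, ord), year)
```

Sorting on match(person, ord) avoids creating a helper column, at the cost of computing the order in a separate step.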
CodePudding user response:
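Another option, sorting by each person's maximum cash (for this sample it gives the same order as sorting by the 2021 value):
library(dplyr)
dat %>%
  group_by(person) %>%
  mutate(maxcash = max(cash)) %>%   # per-person maximum, repeated on each row
  arrange(desc(maxcash)) %>%
  ungroup()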
# # A tibble: 6 x 4
# year person cash maxcash
# <int> <chr> <int> <int>
# 1 2020 personthree 62 62
# 2 2021 personthree 55 62
# 3 2020 personone 29 40
# 4 2021 personone 40 40
# 5 2020 persontwo 17 17
# 6 2021 persontwo 13 17
Or a one-liner, using base R's ave() as a helper to repeat each person's maximum on every one of their rows:
dat %>%
  arrange(-ave(cash, person, FUN = max))
# year person cash
# 4 2020 personthree 62
# 5 2021 personthree 55
# 0 2020 personone 29
# 1 2021 personone 40
# 2 2020 persontwo 17
# 3 2021 persontwo 13
Edit:
If instead of max you mean "always 2021's data", then:
dat %>%
  group_by(person) %>%
  mutate(cash2021 = cash[year == 2021]) %>%   # assumes exactly one 2021 row per person
  arrange(desc(cash2021)) %>%
  ungroup()
# # A tibble: 6 x 4
# year person cash cash2021
# <int> <chr> <int> <int>
# 1 2020 personthree 62 55
# 2 2021 personthree 55 55
# 3 2020 personone 29 40
# 4 2021 personone 40 40
# 5 2020 persontwo 17 13
# 6 2021 persontwo 13 13
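If some people might have no 2021 row, or the helper column should not stay in the result, a variant along these lines may help. It is only a sketch: the extra "personfour" row is hypothetical and added purely to show the missing-year case, and sorted is an illustrative name.

```r
library(dplyr)

# Sample data plus one hypothetical person without a 2021 row.
dat <- tibble(
  year   = c(2020L, 2021L, 2020L, 2021L, 2020L, 2021L, 2020L),
  person = c("personone", "personone", "persontwo", "persontwo",
             "personthree", "personthree", "personfour"),
  cash   = c(29L, 40L, 17L, 13L, 62L, 55L, 99L)
)

sorted <- dat %>%
  group_by(person) %>%
  # [1] yields NA when a person has no 2021 row, instead of a size error
  mutate(cash2021 = cash[year == 2021][1]) %>%
  ungroup() %>%
  # arrange() puts NA last; person and year break ties explicitly
  arrange(desc(cash2021), person, year) %>%
  select(-cash2021)
```

People without a 2021 value end up at the bottom, because arrange() sorts missing values last even inside desc().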
