So I have something like this:
data.frame(content = c("a","a","b","b","c","c"),
eje = c("politics","sports","education","sports","health","politics"),
value = c(3,2,1,2,1,1))
I'd like to group by content and, within each group, keep only the rows whose value is the highest, keeping every tied row when there is a tie.
So for the sample I'd end up with:
data.frame(content = c("a","b","c","c"),
eje = c("politics","sports","health","politics"),
value = c(3,2,1,1))
In SQL I'd do something like RANK() OVER (PARTITION BY content ORDER BY value DESC) and then keep only the rows where the resulting rank column equals 1.
CodePudding user response:
d = data.frame(content = c("a","a","b","b","c","c"),
eje = c("politics","sports","education","sports","health","politics"),
value = c(3,2,1,2,1,1))
library(dplyr)
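d %>%
  group_by(content) %>%
  # n = 1 and with_ties = TRUE are the defaults; ties for the maximum are all kept
  slice_max(value, n = 1, with_ties = TRUE)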
# # A tibble: 4 × 3
# # Groups: content [3]
# content eje value
# <chr> <chr> <dbl>
# 1 a politics 3
# 2 b sports 2
# 3 c health 1
# 4 c politics 1
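If you'd rather mirror the SQL RANK approach from the question literally, here is a minimal sketch using dplyr's min_rank() and desc(). The helper column name rnk is only an illustrative choice and is dropped at the end.

```r
library(dplyr)

d <- data.frame(content = c("a","a","b","b","c","c"),
                eje = c("politics","sports","education","sports","health","politics"),
                value = c(3,2,1,2,1,1))

ranked <- d %>%
  group_by(content) %>%
  # rank 1 = highest value in the group; tied rows share the same rank
  mutate(rnk = min_rank(desc(value))) %>%
  filter(rnk == 1) %>%
  ungroup() %>%
  select(-rnk)  # rnk is a hypothetical helper column, removed here
```

This should give the same four rows as slice_max(); it is mainly useful if you need the rank itself, for example to keep the top 2 per group.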
CodePudding user response:
data.table option:
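library(data.table)
dt <- as.data.table(df)  # df is the data frame from the question
# .I holds row numbers; keep the rows whose value equals the group maximum
dt[dt[, .I[value == max(value)], by = content]$V1]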
Output:
content eje value
1: a politics 3
2: b sports 2
3: c health 1
4: c politics 1
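The same "compare each value with its group maximum" idea can also be sketched in base R with ave(), in case you want to avoid extra packages. The names df and res are assumptions for illustration; df is the question's data frame.

```r
df <- data.frame(content = c("a","a","b","b","c","c"),
                 eje = c("politics","sports","education","sports","health","politics"),
                 value = c(3,2,1,2,1,1))

# ave() returns, for every row, the maximum value within that row's content group
group_max <- ave(df$value, df$content, FUN = max)

# keep the rows that reach their group maximum (ties included)
res <- df[df$value == group_max, ]
```

Note that res keeps the original row names, so reset them with rownames(res) <- NULL if that matters.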
