Home > database >  make unique rows and append min value of some columns
make unique rows and append min value of some columns

Time:01-29

I have a data frame on R. I would like to get the unique rows based on the first three columns and also append the min value of the 4th column in each unique row.

dat <- tibble(
  x = c("a", "a", "k", "k"),
  y = c("a", "a", "l", "l"),
  z = c("e", "e", "m" ,"m"),
  t = c("4", "3", "8" ,"9"))

What I would like to see is below.

x y z t
a a e 3
k l m 8

I believe there is a very easy way to do that but I can not see it at that moment.

CodePudding user response:

With tidyverse, use group_by with summarise

library(dplyr)
dat %>%
  group_by(across(x:z)) %>% 
  summarise(t = min(t), .groups = 'drop')

-output

# A tibble: 2 × 4
  x     y     z     t    
  <chr> <chr> <chr> <chr>
1 a     a     e     3    
2 k     l     m     8    

Or do an arrange and use distinct

dat %>%
  arrange(across(everything())) %>%
  distinct(across(x:z), .keep_all = TRUE)
# A tibble: 2 × 4
  x     y     z     t    
  <chr> <chr> <chr> <chr>
1 a     a     e     3    
2 k     l     m     8    

CodePudding user response:

We may call to apply() to find the unique rows values per row in dat. Then, we can used duplicated() to look for duplicates and use the negation ! to return rows that are not duplicates. We use which to obtain integers corresponding to the rows in dat that are not duplicates. Finally, use these integers (unique_rows) to extract the unique rows from dat. As such, we do not have to append.

unique_rows <- which(!duplicated(apply(dat[, 1:3], 1, unique)))
out <- dat[unique_rows, ]

Output

> out
  x y z t
1 a a e 4
3 k l m 8

CodePudding user response:

Another way to deal with this would be to take minimum value of t column and keep remaining columns as group in aggregate function.

aggregate(t~., dat, min)

#  x y z t
#1 a a e 3
#2 k l m 8
  •  Tags:  
  • Related