make unique rows and append min value of some columns-CodePudding

I have a data frame on R. I would like to get the unique rows based on the first three columns and also append the min value of the 4th column in each unique row.

dat <- tibble(
  x = c("a", "a", "k", "k"),
  y = c("a", "a", "l", "l"),
  z = c("e", "e", "m" ,"m"),
  t = c("4", "3", "8" ,"9"))

What I would like to see is below.

x	y	z	t
a	a	e	3
k	l	m	8

I believe there is a very easy way to do that but I can not see it at that moment.

CodePudding user response：

With tidyverse, use group_by with summarise

library(dplyr)
dat %>%
  group_by(across(x:z)) %>% 
  summarise(t = min(t), .groups = 'drop')

-output

# A tibble: 2 × 4
  x     y     z     t    
  <chr> <chr> <chr> <chr>
1 a     a     e     3    
2 k     l     m     8

Or do an arrange and use distinct

dat %>%
  arrange(across(everything())) %>%
  distinct(across(x:z), .keep_all = TRUE)
# A tibble: 2 × 4
  x     y     z     t    
  <chr> <chr> <chr> <chr>
1 a     a     e     3    
2 k     l     m     8

CodePudding user response：

We may call to apply() to find the unique rows values per row in dat. Then, we can used duplicated() to look for duplicates and use the negation ! to return rows that are not duplicates. We use which to obtain integers corresponding to the rows in dat that are not duplicates. Finally, use these integers (unique_rows) to extract the unique rows from dat. As such, we do not have to append.

unique_rows <- which(!duplicated(apply(dat[, 1:3], 1, unique)))
out <- dat[unique_rows, ]

Output

> out
  x y z t
1 a a e 4
3 k l m 8

CodePudding user response：

Another way to deal with this would be to take minimum value of t column and keep remaining columns as group in aggregate function.

aggregate(t~., dat, min)

#  x y z t
#1 a a e 3
#2 k l m 8