Custom function to replace duplicates with NA does not work

Time:01-10

This is my function:

my_func <- function(x){
  ifelse(duplicated(x), NA_real_, first(x))
} 

I want to apply it to this vector:

vector <- c(1,1,1,3,3,3)

[1] 1 1 1 3 3 3

My expected output:

[1] 1 NA NA 3 NA NA

I have tried with sapply:

sapply(vector, my_func)

gives: 
[1] 1 1 1 3 3 3

I also tried changing the function to:

my_func <- function(x){
  ifelse(duplicated(x), NA_real_, x)
} 

CodePudding user response:

replace_dup = function(x, val = NA_real_) {
  x[duplicated(x)] = val
  x
}

replace_dup(vector)
[1]  1 NA NA  3 NA NA

duplicated(x) will be TRUE for the indices that you want to replace, so you can subset the vector by those indices and replace them.
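To make the logical index visible, here is a small sketch using the values from the question. The names x and dup are only illustrative and are not part of the original code.

```r
x <- c(1, 1, 1, 3, 3, 3)

# duplicated() flags every element that has already appeared earlier
dup <- duplicated(x)
print(dup)
# [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE

# Logical subsetting selects exactly the flagged positions
print(x[dup])
# [1] 1 1 3 3
```

Assigning to x[dup] therefore overwrites only the repeated values and leaves the first occurrence of each value untouched.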

ifelse(duplicated(x), NA_real_, x) is a valid solution too (although slightly more complicated). Called directly on the whole vector, it runs fine for me and produces the expected result, so I don't know why it wasn't working for you.
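As a rough check, the sketch below calls the ifelse() version once on the whole vector. The second call is an assumption on my part: it uses x[1] as a stand-in for first(x) from the original function (assuming first() returns the first element, as dplyr::first() does) to show how a single value is recycled.

```r
x <- c(1, 1, 1, 3, 3, 3)

# Vectorised call on the whole vector: duplicates become NA
res_whole <- ifelse(duplicated(x), NA_real_, x)
print(res_whole)
# [1]  1 NA NA  3 NA NA

# With a single value in the "no" slot (x[1] stands in for first(x)),
# ifelse() recycles that value across all non-duplicated positions
res_first <- ifelse(duplicated(x), NA_real_, x[1])
print(res_first)
# [1]  1 NA NA  1 NA NA
```

This suggests the second version of my_func is the one to keep, provided it is called on the vector directly.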

As for sapply(): that would work if you had a list of vectors and wanted to apply this function to each one:

vectors = list(c(1, 1, 2, 1, 3), c(5, 5, 5))
sapply(vectors, replace_dup)

[[1]]
[1]  1 NA  2 NA  3

[[2]]
[1]  5 NA NA

Edit: As mentioned in the comments, the issue with sapply() here is that the function is already designed to work on an entire vector. sapply(vector, replace_dup) applies replace_dup() to each element of vector separately, and a single element can never be a duplicate of itself, so nothing is replaced:

sapply(vector, replace_dup)
[1] 1 1 1 3 3 3
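The per-element behaviour can be seen by applying duplicated() itself through sapply(). This is only a sketch, and the names per_element and whole are illustrative.

```r
x <- c(1, 1, 1, 3, 3, 3)

# sapply() passes one element at a time, so duplicated() only ever sees
# a length-one vector and always returns FALSE
per_element <- sapply(x, duplicated)
print(per_element)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE

# Calling the function once on the whole vector is what finds the repeats
whole <- duplicated(x)
print(any(whole))
# [1] TRUE
```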