I am trying to randomly replace 1000 NA values in a dataframe column with 0s. The column is composed only of NAs and 1s and it looks like this:
Column
1 NA
2 1
3 NA
4 NA
5 NA
6 1
7 NA
...
I want it to look something like this:
Column
1 0
2 1
3 NA
4 0
5 NA
6 1
7 NA
...
The column I am working with has more than 1000 rows, so both 0s and NAs will remain in the end.
I tried something like this:
is.na(df_col[sample(seq(nrow(is.na(df_col))), 1000), "Column"]) <- 0
This, however, does not work: no NA values are replaced. If I take out the is.na() calls it works, but then some of the 1s might get replaced, which I do not want. Do you know how to solve this?
Answer:
I am assuming that you want to replace 1,000 NA values rather than choosing 1,000 indices and replacing them if they are NA. The following code finds the indices of NA values, then replaces a random sample of 1,000 of those indices with 0.
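library(tibble)  # provides tibble()
set.seed(123)    # make the random sample reproducible
df <- tibble(x = rep(c(1, NA), times = 2000))
indices <- which(is.na(df$x))  # row positions of the NA values
df[sample(indices, 1000, replace = FALSE), "x"] <- 0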
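The same idea can be wrapped in a small base R function. This is only a sketch: the name replace_random_na and its arguments are made up for illustration, and it assumes the column is a plain vector named Column as in the question. It adds two guards that the snippet above does not need for its example data: it stops if there are fewer NAs than requested, and it samples positions with sample.int() so that a vector containing exactly one NA is not misread by sample() as the range 1:k.

```r
# Hypothetical helper (not part of the original answer):
# replace n randomly chosen NA entries of x with `value`.
replace_random_na <- function(x, n, value = 0) {
  na_idx <- which(is.na(x))  # positions of the NA entries
  if (length(na_idx) < n) {
    stop("x has only ", length(na_idx), " NA values, fewer than n = ", n)
  }
  # Sample positions within na_idx instead of calling sample(na_idx, n):
  # sample() treats a single number k as 1:k, which would pick wrong rows
  # when exactly one NA exists.
  chosen <- na_idx[sample.int(length(na_idx), n)]
  x[chosen] <- value
  x
}

set.seed(123)
# Example data mirroring the answer: 2000 ones and 2000 NAs
df <- data.frame(Column = rep(c(1, NA), times = 2000))
df$Column <- replace_random_na(df$Column, 1000)

# Count each value, including the remaining NAs
table(df$Column, useNA = "ifany")
```

With this example data the table should report 1000 zeros, 2000 ones, and 1000 remaining NAs, and none of the 1s are touched because only NA positions are ever sampled.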
