There are a lot of pages that show how to replace a NULL value with either NA or a specific value, but I haven't seen anything that shows how to go the other way (replacing a value with NULL in a vector). Is there a way to do this? Here's a reprex of my {tidyverse} attempt:
library(tidyverse)
# character vectors can have null values
c("None", "Some", NULL, "Many", "All")
#> [1] "None" "Some" "Many" "All"
# but is there a way to replace a string in a vector with null?
c("None", "Some", "NULL", "Many", "All") %>%
str_replace("NULL", NULL)
#> Error: `replacement` must be a character vector
Created on 2022-01-31 by the reprex package (v2.0.1)
CodePudding user response:
It's worth remembering the difference between NULL and NA. NA is a placeholder for a missing value and still occupies a position in the vector, whereas NULL is no value whatsoever and has length zero. To get the second output to match the first, you would need something like the following:
column <- c("None", "Some", "NULL", "Many", "All")
column <- column[column != "NULL"]
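This creates a shorter vector, which is why str_replace can't do it: it returns a vector of the same length as its input, so each replacement has to be a string and an element can't be removed.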
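To make the length difference concrete, here is a minimal base R sketch. The names with_null and with_na are only illustrative and are not from the question.

```r
# NULL has length 0, so c() silently drops it; NA keeps its slot
with_null <- c("None", "Some", NULL, "Many", "All")
with_na   <- c("None", "Some", NA, "Many", "All")

length(with_null)
#> [1] 4
length(with_na)
#> [1] 5

# Filtering out the "NULL" string gives the same vector as the NULL literal
column <- c("None", "Some", "NULL", "Many", "All")
identical(column[column != "NULL"], with_null)
#> [1] TRUE
```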
CodePudding user response:
Character vectors can't contain NULL but we can work around this in several ways.
Convert the character vector to a list, in which case NULL can be an element.
x <- c("None", "Some", "NULL", "Many", "All")
x_list <- replace(as.list(x), x == "NULL", list(NULL))
str(x_list)
## List of 5
##  $ : chr "None"
##  $ : chr "Some"
##  $ : NULL
##  $ : chr "Many"
##  $ : chr "All"
If there are no zero length strings then use that to represent NULL. This is quite common with R providing the nzchar function to test for this -- it returns TRUE for character strings of non-zero length and FALSE otherwise.
x <- c("None", "Some", "NULL", "Many", "All")
x2 <- replace(x, x == "NULL", "")
x2
## [1] "None" "Some" "" "Many" "All"
nzchar(x2)
## [1] TRUE TRUE FALSE TRUE TRUE
Use NA instead of NULL.
x <- c("None", "Some", "NULL", "Many", "All")
replace(x, x == "NULL", NA)
## [1] "None" "Some" NA "Many" "All"
Another approach is to use two vectors. One for the data and one to indicate whether the value is missing. Then the value in the null component can be anything.
x <- c("None", "Some", "NULL", "Many", "All")
x_null <- c(FALSE, FALSE, TRUE, FALSE, FALSE)
A number of packages can handle multiple types of missing values. The memisc package uses an S4 class, "character.item", for this.
library(memisc)
xx <- x
missing.values(xx) <- "NULL"
xx
## Item (measurement: nominal, type: character, length = 5)
##
## [1:5] None Some *NULL Many All
is.missing(xx)
## [1] FALSE FALSE TRUE FALSE FALSE
The naniar package represents different kinds of missing values in a column of a data frame using a second column and the labelled package uses attributes for this.
See the Missing Data Task View for information on other missing value packages as well.
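If you go with the list approach, you will probably want to find the NULL elements again later, or collapse back to a plain vector. The following is a minimal base R sketch of one way to do that. The name is_null is illustrative, and using lengths() for the check assumes every non-NULL element has length one.

```r
x <- c("None", "Some", "NULL", "Many", "All")
x_list <- replace(as.list(x), x == "NULL", list(NULL))

# NULL elements have length 0, so lengths() flags them
is_null <- lengths(x_list) == 0
is_null
## [1] FALSE FALSE  TRUE FALSE FALSE

# unlist() drops the NULL elements, giving the shorter character vector
unlist(x_list)
## [1] "None" "Some" "Many" "All"
```

The list keeps five positions while the unlisted result has four, so the positional information is lost only at the point where you convert back.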
