I have a dataset that looks like this:
| output |
|---|
| Others. Specify (separate by comma if there is more than one): |
| Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one): |
| Family upbringing |
| Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one): |
| Did not say |
How can I remove the sentence "Others. Specify (separate by comma if there is more than one):" from the dataset? I've tried
gsub("Others. Specify (separate by comma if there is more than one):", "", datset$output)
and str_remove_all(), but neither worked.
CodePudding user response:
You can get the desired result by adding fixed = TRUE, which makes gsub() treat the pattern as a literal string instead of a regular expression:
gsub("Others. Specify (separate by comma if there is more than one):",
"",
datset$output,
fixed = TRUE)
#> [1] "" "Everyone cries/has feelings,"
#> [3] "Family upbringing" "Everyone cries/has feelings,"
#> [5] "Did not say"
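Note that the output above still ends in the comma that separated the removed sentence from the preceding answer. If that comma is unwanted (an assumption, since the question does not say), a second pass can drop it. A minimal sketch in base R, using a shortened copy of the data under the hypothetical names x and cleaned:

```r
# Shortened copy of the example data (x and cleaned are illustrative names)
x <- c(
  "Others. Specify (separate by comma if there is more than one):",
  "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
  "Family upbringing"
)

# Step 1: remove the sentence, matching it literally
cleaned <- gsub("Others. Specify (separate by comma if there is more than one):",
                "", x, fixed = TRUE)

# Step 2: remove a single comma left at the very end of a string
cleaned <- sub(",$", "", cleaned)
cleaned
#> [1] ""                            "Everyone cries/has feelings"
#> [3] "Family upbringing"
```

Here ",$" is an ordinary regex that only matches a comma at the end of the string, so commas inside an answer are left alone.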
A second option is to escape all special characters, which in your case are the . and, in particular, the ( and ). In a regex, . matches any character and () creates a capturing group, so to match a literal ( you have to write \\(:
gsub("Others\\. Specify \\(separate by comma if there is more than one\\):", "", datset$output)
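To see why the original call changed nothing, it may help to test the pattern directly. The sketch below uses base R only; x, pat and escaped are illustrative names, not part of the original data:

```r
x <- "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):"
pat <- "Others. Specify (separate by comma if there is more than one):"
escaped <- "Others\\. Specify \\(separate by comma if there is more than one\\):"

# As a regex, ( and ) only group; they do not match literal parentheses,
# so the unescaped pattern expects "Specify separate ..." and finds nothing
as_regex <- grepl(pat, x)

# Treated as a literal string, the same pattern matches
as_fixed <- grepl(pat, x, fixed = TRUE)

# Both fixes give the same result
same <- identical(gsub(pat, "", x, fixed = TRUE), gsub(escaped, "", x))

c(as_regex = as_regex, as_fixed = as_fixed, same = same)
#> as_regex as_fixed     same
#>    FALSE     TRUE     TRUE
```

The same reasoning applies to str_remove_all(): stringr also interprets the pattern as a regex by default, and wrapping it in stringr's fixed() plays the role of fixed = TRUE.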
DATA
datset <- data.frame(
  output = c(
    "Others. Specify (separate by comma if there is more than one):",
    "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
    "Family upbringing",
    "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
    "Did not say"
  )
)
