I'm stuck with a file where someone didn't escape commas inside labels.
Here's an example:
library(tidyverse)
t1 <- c("1,0.259524,0.594196,0.305349,$15,000 - $19,999,Unknown",
"2,0.673729,0.249742,0.729358,Greater than $124,999,College")
The commas are used to separate columns, but they also show up inside the dollars field.
I can match the commas that are causing the problem:
t1 %>%
str_extract_all(
"\\$\\d{2,3},\\d{3}"
)
returns
[[1]]
[1] "$15,000" "$19,999"
[[2]]
[1] "$124,999"
How do I operate on each row, removing only the commas inside that label?
CodePudding user response:
You could use gsub to get rid of the commas:
t1 <- c("1,0.259524,0.594196,0.305349,$15,000 - $19,999,Unknown",
"2,0.673729,0.249742,0.729358,Greater than $124,999,College")
gsub("(\\$\\d+),(\\d{3})", "\\1\\2", t1)
#> [1] "1,0.259524,0.594196,0.305349,$15000 - $19999,Unknown"
#> [2] "2,0.673729,0.249742,0.729358,Greater than $124999,College"
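Once the embedded commas are gone, each row has a consistent number of fields and can be parsed as ordinary CSV. Here is a minimal sketch in base R, assuming the file has no header row; the names `cleaned` and `df` are only illustrative:

```r
t1 <- c("1,0.259524,0.594196,0.305349,$15,000 - $19,999,Unknown",
        "2,0.673729,0.249742,0.729358,Greater than $124,999,College")

# Drop the comma that sits between a "$" + digits prefix and the next three digits
cleaned <- gsub("(\\$\\d+),(\\d{3})", "\\1\\2", t1)

# Parse the cleaned lines as CSV; header = FALSE is an assumption about the file
df <- read.csv(text = cleaned, header = FALSE, stringsAsFactors = FALSE)

print(df)
```

One caveat: the pattern removes only the first comma after each `$`, so an amount such as `$1,250,000` would keep its second comma. That is fine for the sample rows, but larger amounts would need a different pattern.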
