I would like to remove the strings "<b>" and "<" from the name column:
year name
1 <b>abc<
2 <b>judy<
3 <b>lin<
I would like the output to look like this:
year name
1 abc
2 judy
3 lin
CodePudding user response:
We can use sub here:
df$name <- sub("^<b>(.*)<$", "\\1", df$name)
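As a quick check, here is a minimal sketch that applies this to the sample data. The data frame is reconstructed from the table in the question, so its exact structure (column types, the name `df`) is an assumption.

```r
# Sample data rebuilt from the question (assumed structure)
df <- data.frame(
  year = 1:3,
  name = c("<b>abc<", "<b>judy<", "<b>lin<"),
  stringsAsFactors = FALSE
)

# ^<b> matches the leading tag, (.*) captures the name,
# and <$ matches the trailing "<"; \\1 keeps only the captured part
df$name <- sub("^<b>(.*)<$", "\\1", df$name)
df$name
```

Because the pattern is anchored at both ends, values that do not have exactly this shape are left unchanged rather than partially stripped.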
CodePudding user response:
You can str_extract the part you're interested in:
library(stringr)
df$name <- str_extract(df$name, "(?<=<b>)[^<>]+(?=<)")
How this works:
(?<=<b>): if you see <b> on the left (positive lookbehind) ...
[^<>]+: ... match one or more characters that are neither < nor > ...
(?=<): ... provided you also see < on the right (positive lookahead)
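If stringr is not available, the same lookaround idea can probably be tried in base R with perl = TRUE. This is only a sketch, not part of the answer above: the vector `x` is a hypothetical stand-in for df$name, and it assumes every element matches, because regmatches silently drops non-matching elements.

```r
# Hypothetical stand-in for df$name, taken from the question's sample values
x <- c("<b>abc<", "<b>judy<", "<b>lin<")

# Lookbehind for "<b>", one or more characters that are not < or >,
# then a lookahead for "<"; perl = TRUE enables lookarounds in base R
pattern <- "(?<=<b>)[^<>]+(?=<)"

# regexpr finds the first match per element; regmatches extracts it
extracted <- regmatches(x, regexpr(pattern, x, perl = TRUE))
extracted
```

Unlike str_extract, which returns NA for elements without a match, this shortens the result when something fails to match, so check the length before assigning it back to a column.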
