I am using RStudio. I have a column (shown in the photo) whose values are split by an underscore, and I would like to create a new column containing only the part after the underscore. What function should I use?
CodePudding user response:
In base R:
# take the substring from one past the underscore to the end of each string
df$new <- substr(df$post_name, regexpr("_", df$post_name) + 1, nchar(df$post_name))
Or with data.table
# load package
library(data.table)
# set dataframe as datatable
setDT(df)
# create new column
df[, new := substr(post_name, regexpr("_", post_name) + 1, nchar(post_name))]
CodePudding user response:
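Using sub, which is enough if each value is guaranteed to contain only one underscore (the greedy .* removes everything up to and including the last underscore):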
df$new <- sub(".*_","",df$post_name)
df
               post_name     new
1 3433243juhy234_2323526 2323526
2 3433243juhy234_2323526 2323526
3 3433243juhy234_2323526 2323526
Data
df <- data.frame(post_name=c("3433243juhy234_2323526",
"3433243juhy234_2323526", "3433243juhy234_2323526"))
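A minimal sketch of what happens when a value contains more than one underscore. The vector x below is made up for illustration and does not come from the question. Because .* is greedy, the pattern above keeps only what follows the last underscore, whereas a pattern built on non-underscore characters keeps everything after the first one.

```r
# Hypothetical values (not from the question); the second has two underscores
x <- c("3433243juhy234_2323526", "abc_def_ghi")

# Greedy match: removes everything up to and including the LAST underscore
after_last <- sub(".*_", "", x)

# [^_]* cannot cross an underscore, so only the text up to and
# including the FIRST underscore is removed
after_first <- sub("^[^_]*_", "", x)
```

If the values always contain exactly one underscore, both patterns give the same result, so the choice only matters for messier data.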
CodePudding user response:
Use extract from tidyr:
library(tidyr)
df %>%
  extract(post_name,          # identify column from which to extract
          into = "right",     # give new column a name
          regex = ".*_(.*)",  # wrap what you want to extract in capture group `(...)`
          remove = FALSE)     # keep the original post_name column alongside the new one

