I am using RStudio. I have a column (shown in the photo) whose values are split by an underscore, and I would like to create a new column containing only the part that comes after the underscore. What function should I use?

CodePudding user response：

base R

# take everything from the character after the first "_" to the end of each string
df$new <- substr(df$post_name, regexpr("_", df$post_name) + 1, nchar(df$post_name))

Or with data.table

# load package
library(data.table)

# set dataframe as datatable
setDT(df)

# create new column
df[, new := substr(post_name, regexpr("_", post_name) + 1, nchar(post_name))]
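
To see how the pieces fit together, here is a minimal sketch of the base R steps on a small hypothetical vector x (made-up values, not the asker's data). regexpr gives the position of the first underscore in each string, and substr then keeps everything from the next character to the end of that string.

```r
# hypothetical example values, not taken from the question
x <- c("3433243juhy234_2323526", "abc_def")

# position of the first underscore in each string
pos <- regexpr("_", x, fixed = TRUE)

# keep everything from the character after the underscore to the end
after <- substr(x, pos + 1, nchar(x))
after
```

Note that nchar(x) gives the length of each individual string, whereas length(x) gives the number of elements in the vector, so nchar is the one that belongs in the stop argument.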

CodePudding user response：

Using sub, provided it's guaranteed that only one underscore appears:

df$new <- sub(".*_","",df$post_name)
df
               post_name     new
1 3433243juhy234_2323526 2323526
2 3433243juhy234_2323526 2323526
3 3433243juhy234_2323526 2323526

Data

df <- data.frame(post_name=c("3433243juhy234_2323526", 
  "3433243juhy234_2323526", "3433243juhy234_2323526"))
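
If more than one underscore can appear, the greedy .* in that pattern keeps only what follows the last underscore. A small sketch of the difference, using a made-up value y as an assumption about what such data might look like:

```r
# made-up value with two underscores (an assumption, not from the question)
y <- "aaa_bbb_ccc"

# greedy .* matches up to the LAST underscore
after_last <- sub(".*_", "", y)

# [^_]* cannot cross an underscore, so this stops at the FIRST one
after_first <- sub("^[^_]*_", "", y)

c(after_last, after_first)
```

Which of the two is wanted depends on the data; with exactly one underscore per value both give the same result.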

CodePudding user response：

Use extract from tidyr:

library(tidyr)
df %>%
  extract(post_name,          # identify column from which to extract
          into = "right",     # give new column a name
          regex = ".*_(.*)",  # wrap what you want to extract in capture group `(...)`
          remove = FALSE)     # keep the original post_name column alongside the new one