Separate column values after underscores and create new column

Time:02-03

I am using RStudio. I have a column whose values are divided by an underscore, and I would like to create a new column containing only the part that comes after the underscore. What function should I use?


CodePudding user response:

base R

df$new <- substr(df$post_name, regexpr("_", df$post_name) + 1, nchar(df$post_name))

Or with data.table

# load package
library(data.table)

# set dataframe as datatable
setDT(df)

# create new column
df[, new := substr(post_name, regexpr("_", post_name) + 1, nchar(post_name))]
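
To see why the substr approach works, here is a minimal base R sketch on made-up data (the data frame `demo` and its values are assumptions for illustration, not the asker's data). regexpr gives the position of the first underscore in each string, so the substring starts one character after it and runs to the end of the string, whose length comes from nchar:

```r
# Hypothetical sample data, not taken from the question
demo <- data.frame(post_name = c("3433243juhy234_2323526", "abc_123"))

# Position of the first underscore in each value (-1 if there is none)
pos <- regexpr("_", demo$post_name, fixed = TRUE)

# Take everything from the character after the underscore to the end
demo$new <- substr(demo$post_name, pos + 1, nchar(demo$post_name))

print(demo)
```

Note that nchar is evaluated per string, whereas length would return the number of rows in the column.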

CodePudding user response:

Using sub, if it's guaranteed that only one underscore appears (with several underscores, the greedy .* keeps only the text after the last one):

df$new <- sub(".*_","",df$post_name)
df
               post_name     new
1 3433243juhy234_2323526 2323526
2 3433243juhy234_2323526 2323526
3 3433243juhy234_2323526 2323526

Data

df <- data.frame(post_name=c("3433243juhy234_2323526", 
  "3433243juhy234_2323526", "3433243juhy234_2323526"))
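
As a rough illustration of the single-underscore caveat, the sketch below compares two patterns on a made-up vector `x` (hypothetical values, one of which contains two underscores). The greedy pattern strips everything up to the last underscore, while a pattern anchored at the start that matches only non-underscore characters strips up to the first one:

```r
# Hypothetical values; the second one contains two underscores
x <- c("3433243juhy234_2323526", "a_b_c")

# Greedy match: removes everything up to and including the LAST underscore
after_last <- sub(".*_", "", x)

# Anchored, underscore-free prefix: removes only up to the FIRST underscore
after_first <- sub("^[^_]*_", "", x)

print(after_last)   # "2323526" "c"
print(after_first)  # "2323526" "b_c"
```

When every value contains exactly one underscore, both patterns return the same result.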

CodePudding user response:

Use extract from tidyr:

library(tidyr)
df %>%
  extract(post_name,          # identify column from which to extract
          into = "right",     # give new column a name
          regex = ".*_(.*)",  # wrap what you want to extract in capture group `(...)`
          remove = FALSE)     # keep the original column alongside the new one