How to remove a certain portion of the column name in a dataframe?

I have column names in the following format:

col= c('UserLanguage','Q48','Q21...20','Q22...21',"Q22_4_TEXT...202")

I would like to get the column names without the ... and everything after it:

[1] "UserLanguage" "Q48"          "Q21"          "Q22"          "Q22_4_TEXT"

I am not sure how to code it. I found this post here but I am not sure how to specify the pattern in my case.

CodePudding user response：

You can use gsub.

gsub("\\.{3}.*", "", col)

#[1] "UserLanguage" "Q48"          "Q21"          "Q22"          "Q22_4_TEXT"

Or you can use stringr

library(stringr)

str_remove(col, "\\.{3}.*")
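
Since the question is about a dataframe, here is a minimal sketch of applying the same idea to names(df) directly. The data frame df below is made up for illustration, and the duplicate check is an extra precaution I am assuming you want: suffixes like ...20 are often added to keep names unique, so stripping them could produce repeated names.

```r
# Hypothetical data frame with names in the same format as the question
df <- data.frame(1, 2, 3)
names(df) <- c("UserLanguage", "Q21...20", "Q22_4_TEXT...202")

# Drop three literal dots and everything after them
clean <- sub("\\.{3}.*$", "", names(df))

# Only rename if the cleaned names are still unique
if (anyDuplicated(clean) == 0) names(df) <- clean

names(df)
#[1] "UserLanguage" "Q21"          "Q22_4_TEXT"
```

If anyDuplicated(clean) is not 0, the names are left untouched here; you would then have to decide how to handle the clashing columns.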

CodePudding user response：

You could use sub and capture the leading run of word characters (letters, digits and underscores) in each column name:

col <- c("UserLanguage", "Q48", "Q21...20", "Q22...21", "Q22_4_TEXT...202")
sub("^(\\w+).*$", "\\1", col)

[1] "UserLanguage" "Q48"          "Q21"          "Q22"          "Q22_4_TEXT"
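
One difference between the two approaches is worth knowing. This is a small sketch using a made-up name, "user.id...3", that is not in the question: capturing word characters stops at the first dot of any kind, while matching three literal dots only cuts at the ... suffix.

```r
# "user.id...3" is a hypothetical name containing a single dot
x <- c("Q21...20", "user.id...3")

# Cut at three literal dots
by_dots <- sub("\\.{3}.*$", "", x)
by_dots
#[1] "Q21"     "user.id"

# Keep only the leading word characters
by_word <- sub("^(\\w+).*$", "\\1", x)
by_word
#[1] "Q21"  "user"
```

For the names in the question both give the same result, so either works there.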