My column names look like 20819830_r1ar_u_stationary and 2081974_f8ar_u. I am trying to remove the leading set of numbers from each column name. I tried this code:
names(df)[1:2] <- gsub("^.*_","",names(df[,c(1:2)]))
but when I use it, the column names turn into stationary and u. I can see that the code removes everything up to the last _. How do I change it so that it removes everything only up to the first _?
CodePudding user response:
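The pattern .* matches zero or more of any character (. matches any character) and is greedy, so ^.*_ consumes everything up to the last _. Instead, match one or more digits (\\d+) from the start (^) of the string, followed by the _. Since only one match per name is needed, sub is enough:
names(df)[1:2] <- sub("^\\d+_", "", names(df)[1:2])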
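As a quick check, here is a minimal sketch comparing the greedy pattern from the question with the digit-anchored one. The data frame df and its values are made up for illustration; only the two column names come from the question. The last pattern, ^[^_]*_, is an alternative not mentioned above, assuming you want to drop the prefix even when it is not purely digits.

```r
# Hypothetical data frame; only the column names are taken from the question
df <- data.frame(a = 1, b = 2)
names(df) <- c("20819830_r1ar_u_stationary", "2081974_f8ar_u")

# Greedy ".*" backtracks only as far as the LAST underscore
greedy <- gsub("^.*_", "", names(df))
print(greedy)
# [1] "stationary" "u"

# One or more digits at the start, then the FIRST underscore
digits <- sub("^\\d+_", "", names(df))
print(digits)
# [1] "r1ar_u_stationary" "f8ar_u"

# Alternative (assumption): strip any prefix up to the first underscore,
# whether or not it consists of digits
any_prefix <- sub("^[^_]*_", "", names(df))
print(any_prefix)
# [1] "r1ar_u_stationary" "f8ar_u"

# Apply the digit-anchored version to the first two columns
names(df)[1:2] <- sub("^\\d+_", "", names(df)[1:2])
```

The digit-anchored pattern leaves a name unchanged if it does not start with digits followed by an underscore, which is usually the safer behaviour when only some columns carry the numeric prefix.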
CodePudding user response:
Another option using stringr. str_remove will remove the first set of digits that are followed by an underscore:
library(stringr)
str="20819830_r1ar_u_stationary"
str_remove(str, "^[0-9]+(?=_)_")
[1] "r1ar_u_stationary"
