Is there a way to extract the number that appears last in each of these strings?
asd <- c("asdf sfsfsd54 sdfsdfsdf sdfsdfsf654")
asd1 <- c("asdf sfsfsd54 sdfsdfsdf sdfsdfsf65421")
Expected output
new_asd
654
new_asd1
65421
CodePudding user response:
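This code always extracts the last run of digits in a string. The negative lookahead (?!.*\\d) rejects any match that is followed by another digit later in the string:
(stringr::str_extract(asd, stringr::regex("(\\d+)(?!.*\\d)")))
"654"
(stringr::str_extract(asd1, stringr::regex("(\\d+)(?!.*\\d)")))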
"65421"
If you only want the number when the very last character of the string is a digit, you can add a simple ifelse condition that checks for that, e.g.:
x <- c("asdf sfsfsd54 sdfsdfsdf sdfsdfsf654f")
# as.numeric() on a non-digit yields NA (with a coercion warning), which is what the test relies on
ifelse(!is.na(as.numeric(substr(x, nchar(x), nchar(x)))),
       (stringr::str_extract(x, stringr::regex("(\\d+)(?!.*\\d)"))),
       NA)
NA # returns NA because the last character of the string ("f") is not a digit
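For anyone who prefers not to depend on stringr, here is a minimal base R sketch of the same lookahead idea. The helper name `last_number` is made up for illustration and is not part of the answer above; it assumes you want NA for strings that contain no digits at all.

```r
# Hypothetical helper (name is an assumption): last run of digits in each string.
last_number <- function(x) {
  # \\d+ must not be followed by any further digit later in the string
  m <- regexpr("\\d+(?!.*\\d)", x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  hit <- !is.na(m) & m != -1          # positions where a match was found
  out[hit] <- regmatches(x, m)        # regmatches() returns only the matches
  out
}

last_number(c("asdf sfsfsd54 sdfsdfsdf sdfsdfsf654f", "abc"))
# [1] "654" NA
```

Note that this variant returns "654" even though the first string ends in "f", because the lookahead only cares about later digits, not about the end of the string.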
CodePudding user response:
I would use sub combined with ifelse here:
x <- c("asdf sfsfsd54 sdfsdfsdf sdfsdfsf654", "abc", "123")
nums <- ifelse(grepl("\\d$", x), sub(".*?(\\d+)$", "\\1", x), "")
nums
[1] "654" "" "123"
CodePudding user response:
One solution is to split the string on whitespace, take the last substring, and remove any letters from it. This should work as long as the last substring contains only lowercase letters and digits.
library(stringr)
get_num <- function(x) {
  # split on spaces, take the last piece, then strip lowercase letters
  str_remove_all(rev(unlist(str_split(x, " ")))[1], "[a-z]")
}
> get_num(asd)
[1] "654"
> get_num(asd1)
[1] "65421"
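If the last substring might contain uppercase letters or punctuation, one possible variation is to strip everything that is not a digit instead of stripping `[a-z]`. The sketch below is base R only and vectorized; `get_num_base` is a hypothetical name, and it assumes every input string is non-empty.

```r
# Hypothetical base R variant (name is an assumption, not from the answer above).
get_num_base <- function(x) {
  tokens <- strsplit(x, "\\s+")                  # one character vector per input string
  last <- vapply(tokens, function(t) t[length(t)], character(1))
  gsub("[^0-9]", "", last)                       # drop every non-digit character
}

get_num_base(c("asdf sfsfsd54 sdfsdfsdf Sdfsdfsf654", "a_b-65421"))
# [1] "654" "65421"
```

Like the original, this concatenates all digits in the last substring, so "ab12cd34" would give "1234" rather than "34".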
CodePudding user response:
You can do something like this:
library(stringr)
val <- str_extract_all(asd1, "\\d+")[[1]]
tail(val, 1)
"65421"
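Or, in base R:
# note: this removes every non-digit, so it concatenates all digits in the string (54654 for asd), not just the last number
as.numeric(gsub("[^\\d]+", "", asd, perl=TRUE))
# this keeps the digit runs separate, so the last one can be picked with tail()
val <- regmatches(asd1, gregexpr("[[:digit:]]+", asd1))[[1]]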
tail(val, 1)
"65421"
CodePudding user response:
If the string always ends with digits, we can use gsub to delete everything up to and including the last non-digit character:
> x <- c("asdf sfsfsd54 sdfsdfsdf sdfsdfsf654", "asdf sfsfsd54 sdfsdfsdf sdfsdfsf65421")
> gsub(".*\\D", "", x, perl = TRUE)
[1] "654" "65421"
CodePudding user response:
A single regex is sufficient for your situation.
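stringr::str_extract(asd, "(\\d+$)")
The $ anchors the match to the end of the string, so only a run of digits that ends the string is returned; if the string does not end in a digit, the result is NA.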
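Since the expected output in the question looks numeric rather than character, here is a rough base R sketch of the same end-anchored idea that also converts the result. `trailing_number` is a name I made up for this example, and returning NA for strings that do not end in digits is my assumption about the desired behavior.

```r
asd  <- "asdf sfsfsd54 sdfsdfsdf sdfsdfsf654"
asd1 <- "asdf sfsfsd54 sdfsdfsdf sdfsdfsf65421"

# Hypothetical helper (name is an assumption): trailing digits as a numeric value.
trailing_number <- function(x) {
  m <- regexpr("[0-9]+$", x)                     # digits anchored at the end of the string
  out <- rep(NA_real_, length(x))
  hit <- !is.na(m) & m != -1                     # strings that actually end in digits
  out[hit] <- as.numeric(regmatches(x, m))
  out
}

trailing_number(c(asd, asd1, "abc654f"))
# [1]   654 65421    NA
```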
