Here is the data:
| Subject code | Name |
|---|---|
| 401 | John |
| 422 | Mary |
| 463 | Peter |
I would like to create a unique ID based on the last two digits of the subject code. For example:
| ID | Subject code | Name |
|---|---|---|
| S01 | 401 | John |
| S22 | 422 | Mary |
| S63 | 463 | Peter |
Which library should I use? Should I use case_when() in this situation?
CodePudding user response:
You can use str_extract and str_c from the stringr package:
library(tidyverse)
df %>%
  mutate(ID = str_c("S", str_extract(Subject_code, "\\d{2}$")))
Subject_code ID
1 401 S01
2 422 S22
3 463 S63
The regex pattern \\d{2}$ matches the two digits at the end of the string (the $ anchors the match to the string-final position), and str_extract returns them.
Data:
df <- data.frame(
Subject_code = c(401, 422, 463))
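To see the pattern at work on its own, here is a minimal sketch. It assumes stringr is installed; the Name column and the four-digit code 1405 are added for illustration only and are not part of the answer's data. The codes are converted with as.character() explicitly so it is clear that the regex operates on text.

```r
library(stringr)

# Illustrative data: the question's table plus nothing else
df <- data.frame(
  Subject_code = c(401, 422, 463),
  Name = c("John", "Mary", "Peter")
)

# Extract the final two digits of each code (as text)
last_two <- str_extract(as.character(df$Subject_code), "\\d{2}$")

# Prefix "S" and move ID to the front, as in the desired output
df$ID <- str_c("S", last_two)
df <- df[, c("ID", "Subject_code", "Name")]
df

# Because the match is anchored at the end, a longer (invented) code
# still yields its last two digits
longer <- str_extract("1405", "\\d{2}$")
```

Anchoring with $ means the result does not depend on how many digits precede the last two.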
CodePudding user response:
You can use substr and paste0:
data$ID <- paste0("S", substr(data$`Subject code`, 2, 3))
e.g.:
paste0("S", substr(431, 2, 3))
#[1] "S31"
or in dplyr:
library(dplyr)
data %>%
  mutate(ID = paste0("S", substr(`Subject code`, 2, 3)))
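Note that substr(x, 2, 3) relies on every code being exactly three characters long. If that is not guaranteed, one possible variant is to compute the positions from nchar(). The helper name last_two_chars and the four-digit code 1405 below are hypothetical, used only to illustrate the idea.

```r
# Hypothetical helper: last two characters of each code, whatever its length
last_two_chars <- function(x) {
  x <- as.character(x)   # substr() works on character data
  n <- nchar(x)          # length of each code
  substr(x, n - 1, n)    # start and stop are vectorised
}

codes <- c(401, 422, 463, 1405)  # 1405 is an invented four-digit code
ids <- paste0("S", last_two_chars(codes))
ids
```

For the three-digit codes in the question this gives the same IDs as substr(x, 2, 3).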
CodePudding user response:
We can try sub as shown below. The pattern "." matches the first character of the code, which is replaced with "S":
> transform(df, ID = sub(".", "S", SubjectCode))[c(3, 1, 2)]
ID SubjectCode Name
1 S01 401 John
2 S22 422 Mary
3 S63 463 Peter
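Replacing the first character works because the codes have three digits. A variant that targets the last two digits directly, sketched here under the assumption that the data frame has the columns SubjectCode and Name as in the output above, uses a capture group:

```r
# Assumed layout, mirroring the printed output above
df <- data.frame(
  SubjectCode = c(401, 422, 463),
  Name = c("John", "Mary", "Peter")
)

# ".*" consumes everything before the final two digits;
# "\\1" re-inserts the captured digits after the "S" prefix
df$ID <- sub("^.*(\\d{2})$", "S\\1", df$SubjectCode)

# Put ID first, as in the desired output
result <- df[c("ID", "SubjectCode", "Name")]
result
```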
