I'm using RStudio to perform some analysis.
I have this data frame:
| Residue | Energy | Model |
|---|---|---|
| R-A-40 | -3.45 | DELTA |
| R-A-350 | -1.89 | DELTA |
| R-B-468 | -0.25 | DELTA |
| R-C-490 | -2.67 | DELTA |
| R-A-610 | -1.98 | DELTA |
I would like to filter the rows by the numeric part of the first column ("Residue"), keeping values between 300 and 500, and create a new data frame. The new data frame would look like this:
| Residue | Energy | Model |
|---|---|---|
| R-A-350 | -1.89 | DELTA |
| R-B-468 | -0.25 | DELTA |
| R-C-490 | -2.67 | DELTA |
Note that it does not matter whether the value starts with "R-A-", "R-B-" or "R-C-". However, I have other patterns as well (not only these three), so I have to ignore either the non-numeric characters or the first four characters of the "Residue" column.
I did not find any similar question. I appreciate any help!
Thanks in advance
CodePudding user response:
An approach using stringr's str_extract:
library(stringr)
val <- as.numeric(str_extract(df$Residue, "[[:digit:]]+"))
df[val > 300 & val < 500,]
Residue Energy Model
2 R-A-350 -1.89 DELTA
3 R-B-468 -0.25 DELTA
4 R-C-490 -2.67 DELTA
Data
df <- structure(list(Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490",
"R-A-610"), Energy = c(-3.45, -1.89, -0.25, -2.67, -1.98), Model = c("DELTA",
"DELTA", "DELTA", "DELTA", "DELTA")), class = "data.frame", row.names = c(NA,
-5L))
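If the prefixes vary in length or could themselves contain digits, a base R variant that keeps only the trailing number may be more robust. The following is a minimal sketch, not a definitive solution: it assumes the number of interest always sits at the end of the string, and it treats the bounds as inclusive, which the question does not specify. The names num and new_df are illustrative.

```r
# Example data copied from the question
df <- data.frame(
  Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490", "R-A-610"),
  Energy  = c(-3.45, -1.89, -0.25, -2.67, -1.98),
  Model   = "DELTA"
)

# Drop everything up to and including the last non-digit character,
# leaving only the trailing number ("R-B-468" becomes "468")
num <- as.numeric(sub("^.*[^0-9]", "", df$Residue))

# Keep rows whose number lies in [300, 500]
# (assumption: inclusive bounds; use > and < for a strict range)
# Rows without a trailing number yield NA and are dropped
new_df <- df[!is.na(num) & num >= 300 & num <= 500, ]
new_df
```

With the sample data, inclusive and strict bounds select the same three rows, since no residue number is exactly 300 or 500.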
CodePudding user response:
# convert character to integer
df$x <- as.integer(substr(df$Residue, 5, nchar(df$Residue)))
# subset
df[df$x > 300 & df$x < 500, ]