Filter rows by a numeric range when values contain both characters and numbers in R

I'm using RStudio to perform some analysis.

I have this data frame:

Residue	Energy	Model
R-A-40	-3.45	DELTA
R-A-350	-1.89	DELTA
R-B-468	-0.25	DELTA
R-C-490	-2.67	DELTA
R-A-610	-1.98	DELTA

I would like to filter on the numeric part of the first column ("Residue"), keeping values between 300 and 500, and create a new data frame. The new data frame would look like this:

Residue	Energy	Model
R-A-350	-1.89	DELTA
R-B-468	-0.25	DELTA
R-C-490	-2.67	DELTA

Note that it does not matter whether the value starts with "R-A-", "R-B-" or "R-C-". However, I have other patterns as well (not only these three). I need to ignore the non-numeric characters, or the first four characters, of the "Residue" column.

I did not find any similar question. I appreciate any help!

Thanks in advance

CodePudding user response：

An approach using stringr's str_extract

library(stringr)

val <- as.numeric(str_extract(df$Residue, "[[:digit:]]+"))

df[val > 300 & val < 500,]
  Residue Energy Model
2 R-A-350  -1.89 DELTA
3 R-B-468  -0.25 DELTA
4 R-C-490  -2.67 DELTA

Data

df <- structure(list(Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490", 
"R-A-610"), Energy = c(-3.45, -1.89, -0.25, -2.67, -1.98), Model = c("DELTA", 
"DELTA", "DELTA", "DELTA", "DELTA")), class = "data.frame", row.names = c(NA, 
-5L))
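
If you would rather avoid the stringr dependency, the same idea can be sketched in base R. This is only a minimal sketch under two assumptions that are not stated in the question: the number is always the last run of digits in the label, and the bounds 300 and 500 are inclusive. The names num, in_range and df_new are just illustrative.

```r
# Same example data as in the question
df <- data.frame(
  Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490", "R-A-610"),
  Energy  = c(-3.45, -1.89, -0.25, -2.67, -1.98),
  Model   = "DELTA",
  stringsAsFactors = FALSE
)

# Drop everything up to and including the last non-digit character,
# which leaves only the trailing digits, whatever the prefix looks like
num <- as.numeric(sub(".*[^0-9]", "", df$Residue))

# Assumption: bounds are inclusive; rows without a number are excluded
in_range <- !is.na(num) & num >= 300 & num <= 500

# New data frame with only the rows inside the range
df_new <- df[in_range, ]
df_new
```

Because the pattern keys on the last non-digit character rather than on a fixed position, it should not matter how long the prefix is. Switch to > and < if the bounds should be exclusive.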

CodePudding user response：

# convert character to integer
df$x <- as.integer(substr(df$Residue, 5, nchar(df$Residue)))

# subset
df[df$x