Filter rows in a specific range values containing character and number in R

Time:02-03

I'm using RStudio to perform some analysis.

I have this data frame:

Residue Energy Model
R-A-40 -3.45 DELTA
R-A-350 -1.89 DELTA
R-B-468 -0.25 DELTA
R-C-490 -2.67 DELTA
R-A-610 -1.98 DELTA

I would like to filter the rows by the numeric part of the first column ("Residue"), keeping values between 300 and 500, and create a new data frame. The new data frame would look like this:

Residue Energy Model
R-A-350 -1.89 DELTA
R-B-468 -0.25 DELTA
R-C-490 -2.67 DELTA

Note that it does not matter whether the value starts with "R-A-", "R-B-" or "R-C-". However, I have other prefixes as well (not only these three), so I need to ignore the non-numeric characters, or the first four characters, of the "Residue" column.

I did not find any similar question. I appreciate any help!

Thanks in advance

CodePudding user response:

An approach using stringr's str_extract:

library(stringr)

val <- as.numeric(str_extract(df$Residue, "[[:digit:]]+"))

df[val > 300 & val < 500,]
  Residue Energy Model
2 R-A-350  -1.89 DELTA
3 R-B-468  -0.25 DELTA
4 R-C-490  -2.67 DELTA

Data

df <- structure(list(Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490", 
"R-A-610"), Energy = c(-3.45, -1.89, -0.25, -2.67, -1.98), Model = c("DELTA", 
"DELTA", "DELTA", "DELTA", "DELTA")), class = "data.frame", row.names = c(NA, 
-5L))
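
For anyone who would rather not load stringr, the same idea can be sketched in base R. This is an illustrative variant, not part of the answer above: it assumes the number of interest is always the trailing run of digits in Residue, and the names num and filtered are made up for this example.

```r
# Same sample data as in the question
df <- data.frame(
  Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490", "R-A-610"),
  Energy  = c(-3.45, -1.89, -0.25, -2.67, -1.98),
  Model   = "DELTA",
  stringsAsFactors = FALSE
)

# Strip everything up to and including the last non-digit character,
# leaving only the trailing digits (assumption: the number ends the string)
num <- as.numeric(sub("^.*[^0-9]", "", df$Residue))

# Keep rows whose number lies strictly between 300 and 500
filtered <- df[num > 300 & num < 500, ]
filtered
```

Because the pattern only looks at the end of the string, the prefix can have any length or shape. If the bounds should be inclusive, >= and <= can be used instead.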

CodePudding user response:

# convert character to integer
df$x <- as.integer(substr(df$Residue, 5, nchar(df$Residue)))

# subset
df[df$x            
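
A complete, runnable sketch of this substr approach on the sample data might look like the following. The bounds in the subset step are an assumption (strictly between 300 and 500, matching the first answer), and the result name res is hypothetical. Note that the fixed start position 5 only works when every prefix is exactly four characters long, such as "R-A-".

```r
# Same sample data as in the question
df <- data.frame(
  Residue = c("R-A-40", "R-A-350", "R-B-468", "R-C-490", "R-A-610"),
  Energy  = c(-3.45, -1.89, -0.25, -2.67, -1.98),
  Model   = "DELTA",
  stringsAsFactors = FALSE
)

# Convert everything from the 5th character onward to an integer
df$x <- as.integer(substr(df$Residue, 5, nchar(df$Residue)))

# Subset (assumed bounds: strictly between 300 and 500)
res <- df[df$x > 300 & df$x < 500, ]

# Drop the helper column if it is not wanted in the result
res$x <- NULL
res
```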