Eliminating Double Quotes in a Data Frame

Time:01-12

I have a data frame containing 27 columns. All these columns have data that has a structure similar to the one below.

principal_amt <- c('"pa": "5975.00"', '"pa": "2285.00"', '"pa": "15822.00"')
closed_accounts <- c( '"ca": 0', '"ca": 3', '"ca": 0')
status <- c(' "loan_status": "Paid" ', ' "loan_status": "Funded"',' "loan_status": "Funded"')
DF <- data.frame(principal_amt, closed_accounts)

I want to automatically remove the double quotes present in the observations so that the final data frame has a structure similar to this.

principal_amt <- c(5975.00, 2285.00, 15822.00)
closed_accounts <- c(0, 3, 0)
status <- c('Paid','Funded','Funded')
DF_Final <- data.frame(principal_amt, closed_accounts)

How do I go about this?

CodePudding user response:

The readr package ships with a handy parse_number function for such use cases.

library(tidyverse)

DF %>%
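  # Name the columns explicitly: calling across() without .cols is deprecated as of dplyr 1.1.0
  mutate(across(everything(), parse_number))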

  principal_amt closed_accounts
1          5975               0
2          2285               3
3         15822               0

CodePudding user response:

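gsub() can strip every character that is not a digit, a decimal point, or a minus sign. It returns character vectors, so wrap the result in as.numeric() to get numbers.

principal_amt <- as.numeric(gsub("[^0-9.-]", "", c('"pa": "5975.00"', '"pa": "2285.00"', '"pa": "15822.00"')))
closed_accounts <- as.numeric(gsub("[^0-9.-]", "", c('"ca": 0', '"ca": 3', '"ca": 0')))
DF <- data.frame(principal_amt, closed_accounts)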

CodePudding user response:

Base R

DF <- as.data.frame(apply(
  apply(DF, 2, gsub, pattern = '[^0-9.-]', replacement = ''), 2, as.numeric
))

Output

> str(DF)
'data.frame':   3 obs. of  2 variables:
 $ principal_amt  : num  5975 2285 15822
 $ closed_accounts: num  0 3 0
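
The number-only patterns above would blank out a text column such as the status vector from the question. A minimal sketch of one way to handle both kinds of column, assuming every cell has the form "key": value with an optionally quoted value; strip_label is a hypothetical helper name, not something from the answers above:

```r
# Hypothetical helper: drop the leading "key": label, then the quotes
# around the value. Assumes each cell looks like  "key": value
strip_label <- function(x) {
  x <- trimws(as.character(x))
  x <- sub('^"[^"]*":[[:space:]]*', "", x)  # remove the "key": prefix
  gsub('^"|"$', "", x)                      # remove quotes wrapping the value
}

DF <- data.frame(
  principal_amt   = c('"pa": "5975.00"', '"pa": "2285.00"', '"pa": "15822.00"'),
  closed_accounts = c('"ca": 0', '"ca": 3', '"ca": 0'),
  status          = c(' "loan_status": "Paid" ', ' "loan_status": "Funded"',
                      ' "loan_status": "Funded"')
)

# Clean every column as text, then let type.convert() pick the types;
# as.is = TRUE keeps non-numeric columns as character instead of factor.
DF_Final <- DF
DF_Final[] <- lapply(DF, strip_label)
DF_Final <- type.convert(DF_Final, as.is = TRUE)
str(DF_Final)
```

With this approach principal_amt comes back as numeric and status stays character. type.convert() reads closed_accounts as integer, so wrap that column in as.numeric() if doubles are required.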