Home > database >  Convert decimal to thousands separator for specific cases in a data.frame in R
Convert decimal to thousands separator for specific cases in a data.frame in R

Time:01-29

I would like to convert decimal points in columns of a vast data.frame into thousands separators.

For example, I need to convert data similar to:

enter image description here

On data similar to:

enter image description here

I didn't find similar questions on the platform. I appreciate any help.

Here's the data in the example:

structure(list(x1 = c(2.678, 135.613, 6.082, 30.221, 85.809, 
160.804, 173.38, 3.323, 0, 597.91), x2 = c("16.282", "2.636.486", 
"95.2", "514.364", "2.226.858", "4.283.662", "1.565.147", "375.16", 
"4", "11.713.163")), row.names = c(NA, 10L), class = "data.frame")

CodePudding user response:

I'll use ifelse() to identify rows with more than 1 dots ., then remove the first . of these rows and multiply by 1000. If it contains only one ., just multiply the number by 1000.

library(string)
library(tidyverse)

# there would be warning message if you include the as.numeric() in the ifelse(), 
# therefore I separated it in two operations
df %>% mutate(across(everything(),
                     ~ ifelse(str_count(.x, "\\.") > 1, 
                              sub("\\.", "", .x), 
                              .x)),
              across(everything(), 
                     ~ as.numeric(.x) * 1000))

      x1       x2
1    2678    16282
2  135613  2636486
3    6082    95200
4   30221   514364
5   85809  2226858
6  160804  4283662
7  173380  1565147
8    3323   375160
9       0     4000
10 597910 11713163

One of the downside of this approach is that if it contains more than two dots, it probably won't work.

UPDATE: I have borrowed idea from @Dion Groothof to improve this code, now it should work regardless of the number of dots

library(string)
library(tidyverse)

df %>% mutate(across(everything(), ~ as.numeric(gsub("\\.", "", .x))),
              across(everything(), ~ ifelse(.x < 100, .x * 1000, .x)))

      x1       x2
1    2678    16282
2  135613  2636486
3    6082    95200
4   30221   514364
5   85809  2226858
6  160804  4283662
7  173380  1565147
8    3323   375160
9       0     4000
10 597910 11713163

CodePudding user response:

Using base R, this will work too.

out <- data.frame(apply(df, 2, function(x) {
  x <- as.numeric(gsub('\\.', '', x))
  ifelse(x < 100, x * 1000, ifelse(x < 1000, x * 100, x))
}))

Output

> out
       x1       x2
1    2678    16282
2  135613  2636486
3    6082    95200
4   30221   514364
5   85809  2226858
6  160804  4283662
7  173380  1565147
8    3323    37516
9       0     4000
10 597910 11713163
  •  Tags:  
  • Related