I have a variable v as follows:
> head(v)
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
1 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
2 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1
3 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1 0.p1
4 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1 0.m1
5 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
6 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2 0.m2
I want to get rid of the trailing suffix (".m1", ".m2", ".p1", ...) in each element of the data frame.
When I do lapply, it gives me a list of C1, C2, ...
> lapply(v, function(x) as.numeric(gsub("\\..*", "", x))) %>% str
List of 10
$ C1 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C2 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C3 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C4 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C5 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C6 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C7 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C8 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C9 : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
$ C10: num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
However, I want a data frame with the same dimensions, so I do the following, and it works:
> v[]=lapply(v, function(x) as.numeric(gsub("\\..*", "", x)))
> head(v)
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
1 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0
Is [] needed to change each element of v? Is there a better way to code this? Thank you.
CodePudding user response:
v is a data.frame, which is essentially (but not perfectly) a list where all elements are named and are the same length.
lapply always returns a list. Period. It doesn't care that its input came from a frame; that is not its intent.
v = lapply(..) replaces the object reference named "v" with the new object, which is (as stated above) a list. However ...
v[] = lapply(..) replaces the contents of v with the return from lapply(..) without changing the class and attributes of v, so it remains a frame with the list-contents returned by lapply.
Realize that the same effect can be had with v = data.frame(lapply(..)).
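To make the difference concrete, here is a minimal base-R sketch of the three assignments described above. The small frame and the helper name strip_suffix are made up for illustration and are not from the question:

```r
# Hypothetical three-row frame mimicking the question's data
v <- data.frame(C1 = c("0.m2", "0.m1", "0.p1"),
                C2 = c("1.m2", "1.m1", "1.p1"))

# Illustrative helper: drop everything from the first dot on, then convert
strip_suffix <- function(x) as.numeric(sub("\\..*", "", x))

# 1. Plain lapply(): the result is a list, not a data frame
as_list <- lapply(v, strip_suffix)

# 2. v[] <- lapply(): the columns are replaced, the frame itself is kept
in_place <- v
in_place[] <- lapply(in_place, strip_suffix)

# 3. Wrapping in data.frame(): a new frame is built from the list
rebuilt <- data.frame(lapply(v, strip_suffix))

class(as_list)
class(in_place)
class(rebuilt)
dim(in_place)
```

Options 2 and 3 should give the same columns here. The practical difference is that option 2 keeps whatever attributes v already had, while option 3 constructs a fresh frame from the list.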
CodePudding user response:
Since you are asking what other, possibly better, method there would be, here's a tidyverse solution:
library(tidyverse)
df %>%
  mutate(across(everything(), ~ as.numeric(str_extract(., "\\d+"))))
C1 C2 C3
1 0 0 0
2 0 0 0
3 0 0 0
Instead of gsub (or better, sub, since there is only a single match per string) as in your solution, which works too but is slightly more verbose, this uses str_extract to pull out the first run of digits in each string.
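For comparison, the same extraction can be sketched in base R only. The vector x is a made-up example, and regmatches()/regexpr() is used here as a rough base-R analogue of str_extract, not as what stringr does internally:

```r
# Hypothetical values in the same "<digits>.<suffix>" shape as the question
x <- c("0.m1", "12.p2", "3.p1")

# sub() replaces the first match; that is enough, because ".*" already
# consumes the rest of the string after the first dot
with_sub <- as.numeric(sub("\\..*", "", x))

# gsub() gives the same result for this pattern
with_gsub <- as.numeric(gsub("\\..*", "", x))

# Extract the leading digits instead of deleting the suffix
with_extract <- as.numeric(regmatches(x, regexpr("^\\d+", x)))

with_sub
with_gsub
with_extract
```

One caveat with the regmatches() version: elements that do not match are dropped from the result rather than returned as NA, so it only lines up with the input when every string starts with a digit.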
Data:
df <- data.frame(
C1 = c("0.m1", "0.p2", "0.p1"),
C2 = c("0.p1", "0.p0", "0.p1"),
C3 = c("0.m2", "0.p1", "0.p1")
)
