I tried to convert the categorical features in a dataset to factors. However, using apply with as.factor did not work:
convert <- c(2:5, 7:9,11,16:17)
read_file[,convert] <- data.frame(apply(read_file[convert], 2, as.factor))
However, switching to lapply did work:
read_file[,convert] <- data.frame(lapply(read_file[convert], as.factor))
Can someone explain the difference, and why the second version works while the first fails?
CodePudding user response:
apply first converts the data.frame to a matrix (via as.matrix) and, in this case, also returns a matrix. A matrix cannot contain a factor variable: factors are coerced to character when you create a matrix from them. The documentation in help("apply") says:
In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array.
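To see the coercion in practice, here is a minimal sketch. The data frame df and its columns are made up for illustration and merely stand in for read_file:

```r
# Hypothetical toy data standing in for read_file
df <- data.frame(id     = 1:3,
                 colour = c("red", "blue", "red"),
                 size   = c("S", "M", "S"),
                 stringsAsFactors = FALSE)

# as.factor is called on each column, but apply assembles the results
# into a matrix, which drops the factor class again
res <- apply(df[c("colour", "size")], 2, as.factor)

is.matrix(res)               # TRUE
is.character(res)            # TRUE
is.factor(res[, "colour"])   # FALSE
```

So whatever ends up being assigned back is built from a character matrix, not from factors.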
lapply returns a list and a list can contain (almost) anything. In fact, a data.frame is just a list with some additional attributes. You don't even need to call data.frame there. You can just subset-assign a list into a data.frame.
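A short sketch of the lapply version without the data.frame call, again on made-up data (df and the positions in convert are assumptions for illustration, not your actual dataset):

```r
# Hypothetical toy data standing in for read_file
df <- data.frame(id     = 1:3,
                 colour = c("red", "blue", "red"),
                 size   = c("S", "M", "S"),
                 stringsAsFactors = FALSE)
convert <- c(2, 3)  # positions of the columns to convert

# A data.frame is itself a list of columns
is.list(df)                  # TRUE

# lapply returns a list of factors, which can be assigned directly
df[convert] <- lapply(df[convert], as.factor)

is.factor(df$colour)         # TRUE
is.factor(df$size)           # TRUE
```

Columns outside convert (here id) are left untouched.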
