How to identify unique values from a csv with row breaks in R-CodePudding

I have a csv with a line break that I import to R using the read.csv feature and I want to identify the unique values in one of the columns. For example my example.csv file looks like this:

I believe I can use unique after I delete the empty row which I do like this:

df <- read.csv(file = "example.csv",header = FALSE)

colnames(df)[1:2] <- c("path","group")
df <- df[!(df$path=="" | df$group==""), ]


unique_groups <- unique(df$group)

However, unique_groups (despite only having 3 different groups), turns out to be a factor with 4 levels, my 3 different groups and then blank or "".

I have figured out that if I save df as a csv right before the unique_groups step and the read back in that csv, it works fine and then unique_groups is a factor with 3 levels, but I'm wondering if there is a more efficient way to do this? Am I doing something wrong with the initial import or how I remove the blank row?

Any help is appreciated - thanks!

CodePudding user response：

Turns out when I imported my csv, it was classifying column B as a factor which was causing my problems. So now I just import it as a character class and that has resolved my issue, like so:

df <- read.csv(file = "example.csv",header = FALSE, colClasses = c("character","character"))