Changing character length of list columns in R

In R, I have a table with headers, and each column has a different number of non-blank entries, e.g.:

#   Level1   Level2   Level3
#1   a       d         e
#2   b       *blank*   f
#3   c       *blank*   *blank*

This is the code I use to read the file in and convert my df to a list.

df=read.csv("list.csv", header=TRUE, sep = ",")
lst1=list() 
for(i in 1:ncol(df)) {      
  lst1[[i]] <- df[ , i]    
}
names(lst1)=colnames(df)
print(lst1)
str(lst1)

However, I get a list in which every element has the same length, i.e.:

List of 3
 $ level1: chr [1:3] "a" "b" "c"
 $ level2: chr [1:3] "d" "" ""
 $ level3: chr [1:3] "e" "f" ""

Is there a way to alter the list so that the length of each of the 3 elements reflects its actual number of values?

Many thanks.

CodePudding user response：

You can try this; it removes all the empty strings ("") from each element of the list:

result <- lapply(lst1, function(x) x[nzchar(x)])
str(result)
List of 3
 $ Level1: chr [1:3] "a" "d" "e"
 $ Level2: chr [1:2] "b" "f"
 $ Level3: chr "c"

I hope that is what you need.

You may also want to avoid the for loop when building lst1:

lst1 <- split(t(df), rownames(t(df)))
# Then apply the code above.
result <- lapply(lst1, function(x) x[nzchar(x)])

With data:

df <- structure(list(Level1 = c("a", "d", "e"), Level2 = c("b", "", 
"f"), Level3 = c("c", "", "")), class = "data.frame", row.names = c(NA, 
-3L))
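
As a possible simplification, assuming the same df as above: a data frame is itself a list of columns, so as.list(df) should give the same kind of list without a loop or a transpose. A minimal sketch (the names lst2 and result2 are only illustrative):

```r
# Same example data as in this answer
df <- data.frame(Level1 = c("a", "d", "e"),
                 Level2 = c("b", "", "f"),
                 Level3 = c("c", "", ""),
                 stringsAsFactors = FALSE)

# A data frame is a list of columns, so this yields one named element per column
lst2 <- as.list(df)

# Drop the empty strings from every element
result2 <- lapply(lst2, function(x) x[nzchar(x)])

# Number of values kept per column
print(lengths(result2))
```

lapply(df, ...) can also be called on the data frame directly, since lapply treats it as a list of columns.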

CodePudding user response：

Starting from the data frame, you can modify your code like this. It checks each element of the list and drops the blank and/or NA values.

lst1=list() 
for(i in 1:ncol(df)) {      
    lst1[[i]] <- df[ , i]    
}

lst1 <- lapply(lst1, function(z){ z[!is.na(z) & z != ""]}) # this checks for blank and/or NA

names(lst1)=colnames(df)
print(lst1)
str(lst1)
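
The explicit is.na() check matters if the data can contain NA as well as blanks. A small sketch on a made-up vector x (not from the question) comparing three filters:

```r
# Hypothetical vector with one value, one blank and one NA
x <- c("a", "", NA)

# nzchar() returns TRUE for NA by default (keepNA = FALSE), so the NA is kept
keep_nzchar <- x[nzchar(x)]

# NA != "" evaluates to NA, and indexing with NA yields an NA element
keep_neq <- x[x != ""]

# Checking is.na() explicitly drops both the blank and the NA
keep_both <- x[!is.na(x) & x != ""]

print(keep_nzchar)
print(keep_neq)
print(keep_both)
```

Only the last form removes the NA; the first two still return a vector of length 2.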

CodePudding user response：

If you want to stick to your approach, you can subset df on both rows and columns.

lst1=list()
for(i in 1:ncol(df)) {      
  lst1[[i]] <- df[df[ , i] != '' , i]
}

names(lst1)=colnames(df)

However, a much more concise answer is this neat one-liner using map() from the purrr package:

library(purrr)
map(df, function(x) x[nzchar(x)])

$Level1
[1] "a" "b" "c"

$Level2
[1] "d"

$Level3
[1] "e" "f"

With data:

df <- structure(list(Level1 = c("a", "b", "c"), Level2 = c("d", "", 
""), Level3 = c("e", "f", "")), class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response：

lapply(df, function(x) x[x != ""])

$Level1
[1] "a" "d" "e"

$Level2
[1] "b" "f"

$Level3
[1] "c"

data

df <- structure(list(Level1 = c("a", "d", "e"), Level2 = c("b", "", 
"f"), Level3 = c("c", "", "")), class = "data.frame", row.names = c(NA, 
-3L))
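
As a variation on the approaches above, the blanks could also be turned into NA when the file is read, and then dropped with is.na(). A minimal sketch, assuming the file looks like the table in the question; csv_text is a hypothetical stand-in for list.csv, and df_na and result_na are illustrative names:

```r
# Hypothetical contents of list.csv, reconstructed from the table in the question
csv_text <- "Level1,Level2,Level3
a,d,e
b,,f
c,,
"

# na.strings = "" makes read.csv treat blank cells as NA
df_na <- read.csv(text = csv_text, header = TRUE,
                  na.strings = "", stringsAsFactors = FALSE)

# Keep only the non-missing values of each column
result_na <- lapply(df_na, function(x) x[!is.na(x)])

str(result_na)
```

With a real file, text = csv_text would be replaced by the file name, as in the original read.csv call.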