Read CSV files from a list but add the filename as an identifier

I have a list of CSV files. I can read them all at once using read_csv.

But I would like to add the filename as an identifier column. How do I do that?

library(tidyverse)

# read file names
csv_filenames <- list.files(path = "OMITTED FOR THIS EXAMPLE", 
                            full.names = TRUE)


# csv_filenames are "One.csv", "Two.csv", "Three.csv", ...

# read csv files
df <- read_csv(csv_filenames)

CodePudding user response：

read_csv has an id argument (readr 2.0.0 and later). If you set id = "path", the result gets a column named "path" that records which file each row came from:

csv_data <- read_csv(csv_filenames, id = "path")

The column holds each path exactly as it was passed in. If you want just the base file name, add a dplyr::mutate step:

library(dplyr)
csv_data <- read_csv(csv_filenames, id = "path") %>%
  mutate(path = basename(path))
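
To try this without touching real data, here is a minimal, self-contained sketch. The temporary directory, the two demo files, and the names demo_dir, demo_files and demo_data are made up for illustration and are not part of the question.

```r
library(readr)
library(dplyr)

# Hypothetical demo data: two small CSV files in a temporary directory
demo_dir <- file.path(tempdir(), "id_demo")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(data.frame(x = 1:2), file.path(demo_dir, "One.csv"))
write_csv(data.frame(x = 3:5), file.path(demo_dir, "Two.csv"))

demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read both files in one call; `id` names the column that records the source file,
# and basename() then strips the directory part from that column
demo_data <- read_csv(demo_files, id = "path", show_col_types = FALSE) %>%
  mutate(path = basename(path))

# Number of rows contributed by each file
count(demo_data, path)
```

Reading several files in one call like this assumes they all share the same columns.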

CodePudding user response：

You should be able to use assign with basename in a for loop.

for(i in seq_along(csv_filenames)){
  assign(basename(csv_filenames)[i], read.csv(csv_filenames[i]))
}

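Using basename names each new object in the global environment after the file itself (for example One.csv) rather than after the full path that full.names = TRUE returns. Note that this creates one separate data frame per file; it does not add an identifier column.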
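
If many objects in the global environment are inconvenient, one possible alternative (not part of this answer) is to keep the data frames in a single list named by file. This is a sketch using hypothetical demo files. The names demo_dir, demo_files and demo_list are illustrative only.

```r
# Hypothetical demo data: two small CSV files in a temporary directory
demo_dir <- file.path(tempdir(), "list_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(demo_dir, "One.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(demo_dir, "Two.csv"), row.names = FALSE)

demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# One list element per file, named after the file instead of a global object
demo_list <- lapply(demo_files, read.csv)
names(demo_list) <- basename(demo_files)

# Access a single data frame by its file name
nrow(demo_list[["Two.csv"]])
```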

CodePudding user response：

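library(dplyr)

# list the CSV files; full.names = TRUE keeps the directory so read.csv can find them,
# and the pattern is a regular expression that matches names ending in ".csv"
file_list <- list.files(path = "path/to/csv/files", pattern = "\\.csv$",
                        full.names = TRUE)

# read each file and add its file name (without the directory) as an additional column
data_list <- lapply(file_list, function(x) {
  read.csv(file = x, stringsAsFactors = FALSE) %>%
    mutate(file_name = basename(x))
})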
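
The result above is a list with one data frame per file. If a single data frame is wanted, dplyr::bind_rows can stack the list. The sketch below uses a small hand-made stand-in for data_list, so the values are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for data_list: two data frames that already
# carry a file_name column
data_list <- list(
  data.frame(x = 1:2, file_name = "One.csv"),
  data.frame(x = 3:5, file_name = "Two.csv")
)

# Stack all list elements into one data frame; file_name tells the rows apart
combined <- bind_rows(data_list)

table(combined$file_name)
```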

CodePudding user response：

With base R:

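csv_files <- lapply(csv_filenames, read.csv)
# file names without directory or extension, e.g. "One"
file_names <- tools::file_path_sans_ext(basename(csv_filenames))
# add the file name as a column to each data frame
out <- lapply(seq_along(csv_files), function(i) {
  transform(csv_files[[i]], file_name = file_names[i])
})
# stack everything into one data frame
do.call(rbind, out)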