Adding a column to multiple .csv files with the file name as you combine those .csv files into a sin

Time:01-29

I have 50 .csv files, over 188k rows combined, and I need to add the file name to each row so that I can track which file it came from. The code below combines the files into a single df.

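import pandas as pd

# `files` is a list of file names defined elsewhere
frames = []
for file in files:
    if file.endswith('.csv'):
        frames.append(pd.read_csv(file))
# DataFrame.append was removed in pandas 2.0; concatenate once instead
df = pd.concat(frames, ignore_index=True)
df.head()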

CodePudding user response:

You're almost there. Instead of appending the result of read_csv() directly, store it in a variable and add a new column holding the file name:

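frames = []
for file in files:
    if file.endswith('.csv'):
        df_new = pd.read_csv(file)
        df_new['from_file'] = file  # record the source file on every row
        frames.append(df_new)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
df = pd.concat(frames, ignore_index=True)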

Also, if your file variable actually holds the whole path to the file, you can use os.path.basename(file), which returns only the file name without the path.
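As a rough sketch of how the two ideas fit together, the helper below tags each row with the bare file name and builds the result with a single pd.concat call. The function name combine_csvs and its column parameter are made up for illustration and are not part of the original answer; it also assumes at least one path ends in .csv, since pd.concat raises an error on an empty list.

```python
import os

import pandas as pd


def combine_csvs(paths, column="from_file"):
    """Read each CSV in `paths` and tag its rows with the bare file name.

    Hypothetical helper: the name and the `column` parameter are
    illustrative assumptions, not taken from the original answer.
    """
    frames = []
    for path in paths:
        if path.endswith(".csv"):
            frame = pd.read_csv(path)
            # basename drops the directory part, e.g. "data/a.csv" -> "a.csv"
            frame[column] = os.path.basename(path)
            frames.append(frame)
    # One concat at the end avoids re-copying the growing result each iteration
    return pd.concat(frames, ignore_index=True)
```

Calling combine_csvs(files) with the same files list as above should give one DataFrame whose from_file column holds only file names, even when files contains full paths.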
