How do I loop through a file path in glob.glob to create multiple files at once?

Time:02-03

I have 10 different folder paths that I run this code through. Instead of changing the path manually each time, I am trying to write a function that loops through the folder paths to save time. Also, can you show me a way to stop glob.glob from duplicating rows in the output file? If I run this code once, it creates one combined file from the files in the folder. If I run it twice (by accident), it duplicates the rows in the CSV. For example, the combined .csv has 100 rows after the first run; after a second run it has 200 rows, with every row duplicated. I want the code to overwrite the previous file without duplicating rows, because I store this file on a server.

I currently have 10 copies of this code, each pointing at a separate folder. Instead of running them separately, I want to loop over the folders with one piece of code and create all the combined files at once.

import glob
import os

import pandas as pd

# Change the working directory to the personal directory folder
os.chdir("C:/Users/File.csv")

extension = 'csv'
all_filenames = glob.glob('*.{}'.format(extension))

# Use pandas to combine all files in the list
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])
# Export to CSV
combined_csv.to_csv("File.csv", index=False, encoding='utf-8')

CodePudding user response:

You should ignore File.csv when processing the list, so you don't append it to itself.

import os

combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames if os.path.basename(f) != 'File.csv' ])
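To see the filter on its own, here is a minimal sketch that only builds the list of input files. The helper name `list_input_csvs` and its `folder` parameter are hypothetical, not part of the original code; it assumes the combined file is saved in the same folder as its inputs, which is why a second run would otherwise pick it up.

```python
import glob
import os


def list_input_csvs(folder, output_name="File.csv"):
    """Return the CSV paths in `folder`, skipping the combined output file."""
    pattern = os.path.join(folder, "*.csv")
    # Compare base names only, because glob returns paths that include `folder`.
    return sorted(
        path for path in glob.glob(pattern)
        if os.path.basename(path) != output_name
    )
```

Because the pattern includes the folder, this version does not need os.chdir, so the working directory stays unchanged between folders.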

CodePudding user response:

Put your code in a function to make it easier to reuse and more readable.

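def combine_csvs(path, output_name="File.csv"):
    # root_dir requires Python 3.10+; the names returned are relative to path
    filenames = glob.glob("*.csv", root_dir=path)
    combined = pd.concat([pd.read_csv(os.path.join(path, f)) for f in filenames if f != output_name])
    # index=False matches the original export and keeps the row index out of the file
    combined.to_csv(os.path.join(path, output_name), index=False)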

Then the loop is simply:

# my_paths is the list of folder paths to process
for path in my_paths:
    combine_csvs(path)
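For Python versions older than 3.10, where glob.glob has no root_dir argument, the same idea can be sketched with pathlib. This is one possible variant under stated assumptions, not the answer's exact code: the names `combine_folder` and `combine_all_folders` are hypothetical, and it assumes every CSV in a folder shares the same columns. It also skips folders with no input files, since pd.concat raises ValueError on an empty list.

```python
from pathlib import Path

import pandas as pd


def combine_folder(folder, output_name="File.csv"):
    """Combine every CSV in `folder` into `output_name`, replacing any old copy."""
    folder = Path(folder)
    # Skip the previous output so a rerun never feeds it back into itself.
    inputs = sorted(p for p in folder.glob("*.csv") if p.name != output_name)
    if not inputs:
        return None  # nothing to combine in this folder
    combined = pd.concat([pd.read_csv(p) for p in inputs], ignore_index=True)
    out_path = folder / output_name
    # to_csv opens the file in write mode by default, so the old file is overwritten.
    combined.to_csv(out_path, index=False, encoding="utf-8")
    return out_path


def combine_all_folders(folders, output_name="File.csv"):
    """Run combine_folder on each folder and return the files that were written."""
    written = []
    for folder in folders:
        out_path = combine_folder(folder, output_name)
        if out_path is not None:
            written.append(out_path)
    return written
```

Calling combine_all_folders(my_paths) twice should leave each combined file with the same rows as after the first run, because the output file is excluded from its own inputs.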