Reading multiple csv files with specific string value

Time:01-18

I have multiple CSV files in my directory, but I want to read only the files whose names contain specific strings.

Files:

QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_ListPrice.csv,
QA Finance GRM CONS ASPAC_Sales_6698_WI4_2021_GrsToNet.csv,
QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_UnitsChanges.csv

I want to read only the files whose names contain "ListPrice" or "UnitsChanges", in one go.

Tried this:

os.chdir(path=source_path)
all_csv_files = glob.glob("*.csv")
print(all_csv_files)

for file in all_csv_files:
    if ("ListPrice" in file):
        uploadfiles = [f for f in listdir(source_path)
                       if isfile(join(source_path, f))]
        print("Upload files:", *uploadfiles, sep='\n')

CodePudding user response:

You can read the specific files with the code below, and you can edit the filesName and keywords lists in one place:

import pandas as pd
filesName = [
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_ListPrice.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI4_2021_GrsToNet.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_UnitsChanges.csv"
]

keywords = [
    "ListPrice",
    "UnitsChanges"
]

for name in filesName:
    # Read a file once if its name contains at least one keyword
    if any(key in name for key in keywords):
        df = pd.read_csv(name)
        print(df)

CodePudding user response:

If this question is just about filtering a list, there are many StackOverflow posts related to '[python] filter a list', I recommend you check them out.

Specifically for your question, how about "glob-ing" for each kind of file and combining them:

lp_files = glob.glob('*ListPrice.csv')
uc_files = glob.glob('*UnitsChanges.csv')
filtered = lp_files + uc_files

I think that very clearly shows you and anyone else what you want/expect.

If you still want to only glob once, and filter multiple files, I suggest creating a little filter function:

def csv_filter(fname):
    if 'ListPrice' in fname:
        return True
    if 'UnitsChanges' in fname:
        return True
    # if 'SomeOtherText' in fname:
    #     return True

    return False

You can add and remove conditions in that function very easily; just leave the final return False for any file that doesn't match your filters.
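If the list of conditions grows, one possible variation is to drive the filter from a tuple of keywords instead of one if-statement per string. This is only a sketch: the name KEYWORDS and the keywords parameter are assumptions made for illustration, not part of the original function.

```python
# Hypothetical keyword tuple; edit it to match the files you need.
KEYWORDS = ("ListPrice", "UnitsChanges")


def csv_filter(fname, keywords=KEYWORDS):
    """Return True if any keyword occurs in the file name."""
    return any(key in fname for key in keywords)


# The three file names from the question, used here as sample input.
files = [
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_ListPrice.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI4_2021_GrsToNet.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_UnitsChanges.csv",
]

filtered = [f for f in files if csv_filter(f)]
print(filtered)
```

Adding or removing a file type then means editing only the tuple, and the function body stays the same.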

You can call that from a one-line list comprehension:

all_csv_files = glob.glob('*.csv')
filtered = [x for x in all_csv_files if csv_filter(x)]

which is equivalent to this more traditional for-loop, which you were trying:


filtered = []
for x in all_csv_files:
    if csv_filter(x):
        filtered.append(x)

# Now, do something with filtered CSVs
# ...
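As one possible way to "do something" with the filtered list, the sketch below reads each file into a list of row dictionaries using the standard-library csv module. The helper name read_csv_files and the demo file with its columns are assumptions made so the example runs on its own; with pandas installed, pd.read_csv could be called on each path instead.

```python
import csv
import os
import tempfile


def read_csv_files(paths):
    """Read each CSV into a list of row dicts, keyed by file path."""
    tables = {}
    for path in paths:
        # newline="" is what the csv module documentation recommends
        with open(path, newline="") as handle:
            tables[path] = list(csv.DictReader(handle))
    return tables


# Demo with a throwaway file (hypothetical name and columns).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_ListPrice.csv")
    with open(demo_path, "w", newline="") as handle:
        handle.write("sku,price\nA1,9.99\n")
    tables = read_csv_files([demo_path])
    rows = tables[demo_path]

print(rows)
```

In the real script you would pass the filtered list directly, for example tables = read_csv_files(filtered).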

Also, there might be something else wrong here, but I'm not sure:

if ("ListPrice" in file):
    uploadfiles = [f for f in listdir(source_path)
                   if isfile(join(source_path, f))]
    print("Upload files:", *uploadfiles, sep='\n')

You have the condition that if file matches your 'ListPrice' filter, do something, but you are not doing anything with file. There may be some broader implication of having found a 'ListPrice' file in source_path, but I'd expect to see file used somewhere inside that if-statement.
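For comparison, here is a rough sketch of a loop that does use the matched file inside the if-statement. It is a guess at the intent, not a statement of what the original code was meant to do. The function name find_matching_csvs and the demo file names are hypothetical, and pathlib replaces os.chdir plus glob purely as a convenience.

```python
import tempfile
from pathlib import Path


def find_matching_csvs(source_path, keywords):
    """Return sorted names of CSV files in source_path containing any keyword."""
    matches = []
    for file in Path(source_path).glob("*.csv"):
        if any(key in file.name for key in keywords):
            # The matched file itself is what gets collected
            matches.append(file.name)
    return sorted(matches)


# Demo with empty throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a_ListPrice.csv", "b_GrsToNet.csv", "c_UnitsChanges.csv"):
        (Path(tmp) / name).touch()
    upload_files = find_matching_csvs(tmp, ("ListPrice", "UnitsChanges"))

print("Upload files:", *upload_files, sep="\n")
```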
