I have multiple CSV files in my directory, but I want to read only the files with specific strings in their filenames.
Files:
QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_ListPrice.csv,
QA Finance GRM CONS ASPAC_Sales_6698_WI4_2021_GrsToNet.csv,
QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_UnitsChanges.csv
I want to read only the files whose names contain "ListPrice" or "UnitsChanges", in one go.
Tried this:
os.chdir(path=source_path)
all_csv_files = glob.glob("*.csv")
print(all_csv_files)
for file in all_csv_files:
    if ("ListPrice" in file):
        uploadfiles = [f for f in listdir(source_path)
                       if isfile(join(source_path, f))]
        print("Upload files:", *uploadfiles, sep='\n')
CodePudding user response:
You can read your specific files with this code. You can also edit the filesName and keywords lists in one place:
import pandas as pd

# Candidate files; edit this list as needed
filesName = [
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_ListPrice.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI4_2021_GrsToNet.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_UnitsChanges.csv"
]
# A file is read if its name contains any of these strings
keywords = [
    "ListPrice",
    "UnitsChanges"
]
for names in filesName:
    for key in keywords:
        if key in names:
            # pandas is imported as pd, so the call must go through that alias
            df = pd.read_csv(names)
            print(df)
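As a variation on the loop above, here is a minimal sketch that discovers the files with a glob instead of a hardcoded list, and reads each matching file only once even if its name contains several keywords. The helper names matches_any and read_matching are hypothetical, and the sketch uses the standard-library csv module to stay dependency-free; if you want DataFrames, you could swap the reading step for pd.read_csv(path).

```python
import csv
from pathlib import Path

# Assumed keywords, taken from the filenames in the question
KEYWORDS = ("ListPrice", "UnitsChanges")


def matches_any(name, keywords=KEYWORDS):
    """Return True if the filename contains at least one keyword."""
    return any(key in name for key in keywords)


def read_matching(directory, keywords=KEYWORDS):
    """Return {filename: list of rows} for CSV files whose name matches."""
    tables = {}
    for path in sorted(Path(directory).glob("*.csv")):
        if matches_any(path.name, keywords):
            # newline="" is what the csv module expects for file objects
            with path.open(newline="") as handle:
                tables[path.name] = list(csv.reader(handle))
    return tables
```

Calling read_matching(source_path) would then return one entry per ListPrice or UnitsChanges file, keyed by filename.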
CodePudding user response:
If this question is just about filtering a list, there are many Stack Overflow posts on that topic (search for '[python] filter a list'); I recommend you check them out.
Specifically for your question, how about "glob-ing" for each kind of file and combining them:
lp_files = glob.glob('*ListPrice.csv')
uc_files = glob.glob('*UnitsChanges.csv')
filtered = lp_files + uc_files
I think that very clearly shows you and anyone else what you want/expect.
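If you want to check what those two patterns select without touching the file system, here is a small sketch using the standard-library fnmatch module, which applies the same shell-style wildcards to plain strings. The helper name match_patterns is made up for illustration, and the file names are the ones from the question.

```python
from fnmatch import fnmatch

names = [
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_ListPrice.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI4_2021_GrsToNet.csv",
    "QA Finance GRM CONS ASPAC_Sales_6698_WI3_2021_UnitsChanges.csv",
]
patterns = ["*ListPrice.csv", "*UnitsChanges.csv"]


def match_patterns(names, patterns):
    """Keep names matching at least one wildcard pattern, in original order."""
    return [name for name in names
            if any(fnmatch(name, pattern) for pattern in patterns)]


filtered = match_patterns(names, patterns)
print(*filtered, sep="\n")
```

Unlike concatenating two glob results, this keeps the original order of the names and cannot list a file twice.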
If you still want to only glob once, and filter multiple files, I suggest creating a little filter function:
def csv_filter(fname):
    if 'ListPrice' in fname:
        return True
    if 'UnitsChanges' in fname:
        return True
    # if 'SomeOtherText' in fname:
    #     return True
    return False
You can add and remove filters in that function very easily; just leave the final return False for any file that doesn't match your filters.
You can call that from a one-line list comprehension:
all_csv_files = glob.glob('*.csv')
filtered = [x for x in all_csv_files if csv_filter(x)]
which is equivalent to this more traditional for-loop, which you were trying:
filtered = []
for x in all_csv_files:
    if csv_filter(x):
        filtered.append(x)
# Now, do something with filtered CSVs
# ...
Also, there might be something else wrong here, but I'm not sure:
if ("ListPrice" in file):
    uploadfiles = [f for f in listdir(source_path)
                   if isfile(join(source_path, f))]
    print("Upload files:", *uploadfiles, sep='\n')
You have the condition that if file matches your 'ListPrice' filter, do something, but you are not doing anything with file. There may be some broader implication of having found a 'ListPrice' file in source_path, but I'd expect to see file used somewhere inside that if-statement.
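For illustration, here is one way that loop might look if the intent was to collect only the matching files. This is a guess at what you wanted, not a known requirement; the function name collect_upload_files is hypothetical, and the keyword default comes from your snippet.

```python
from os.path import isfile, join


def collect_upload_files(source_path, candidates, keyword="ListPrice"):
    """Return full paths of candidate files whose name contains the keyword."""
    uploadfiles = []
    for file in candidates:
        full_path = join(source_path, file)
        # Use `file` itself: keep it only if it matches and really is a file
        if keyword in file and isfile(full_path):
            uploadfiles.append(full_path)
    return uploadfiles
```

You could then call collect_upload_files(source_path, all_csv_files) and print the result the same way as before.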
