The following code works well for concatenating multiple .csv files into one. All of these .csv files reside in the same directory. The problem is that it only works if my current file is in the same directory as the .csv files. I have tried different syntaxes to specify the path to the directory where those multiple .csv files reside, without success. Where in the code below should I specify the path to the .csv directory? I am working in a Jupyter Notebook:
import pandas as pd
import os
filepaths = [f for f in os.listdir(".") if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))
CodePudding user response:
os.listdir(dir) lists the files in the path dir. In your example, dir='.', which corresponds to the current working directory (the directory from which you run your script). You can change this argument to the directory where your .csv files reside. Note that os.listdir returns bare file names, so each name has to be joined with the directory before it is passed to pd.read_csv.
import pandas as pd
import os
base_dir = os.path.join('path', 'to', 'files')
# os.listdir returns bare names, so prepend base_dir to get a usable path
filepaths = [os.path.join(base_dir, f) for f in os.listdir(base_dir) if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))
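To see why the directory has to be joined back onto each name, here is a minimal sketch that uses only the standard library and a throwaway temporary directory. The helper name list_csv_paths and the file names are made up for illustration; the pandas step is left out so the snippet runs anywhere.

```python
import os
import tempfile


def list_csv_paths(directory):
    """Return the sorted full paths of the .csv files directly inside `directory`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith('.csv')
    )


# Demo in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ('a.csv', 'b.csv', 'notes.txt'):
        with open(os.path.join(tmp_dir, name), 'w') as handle:
            handle.write('x,y\n1,2\n')

    # os.listdir gives names only, with no directory component.
    bare_names = sorted(os.listdir(tmp_dir))

    # The helper returns paths that can be opened from any working directory.
    csv_paths = list_csv_paths(tmp_dir)
    csv_names = [os.path.basename(path) for path in csv_paths]
    all_exist = all(os.path.isfile(path) for path in csv_paths)
```

The list returned by list_csv_paths can be handed to map(pd.read_csv, ...) as in the code above.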
Slower version with globbing
You can avoid using endswith() by globbing:
import pandas as pd
import os
import glob
base_dir = os.path.join('path', 'to', 'files')
# Join base_dir and the wildcard so the pattern is 'path/to/files/*.csv'
filepaths = glob.glob(os.path.join(base_dir, '*.csv'))
df = pd.concat(map(pd.read_csv, filepaths))
This expands the wildcard * to find all files that end with .csv in base_dir.
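One detail worth checking with glob: the pattern needs a path separator between the directory and the wildcard, otherwise it is matched against sibling names of the directory rather than its contents. A small sketch under made-up names (a 'files' folder inside a temporary directory):

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    # Hypothetical layout: <parent>/files/{a.csv, b.csv, notes.txt}
    base_dir = os.path.join(parent, 'files')
    os.mkdir(base_dir)
    for name in ('a.csv', 'b.csv', 'notes.txt'):
        with open(os.path.join(base_dir, name), 'w') as handle:
            handle.write('x,y\n1,2\n')

    # No separator: the pattern is '<parent>/files*.csv', which searches
    # <parent> for names starting with 'files' and finds nothing here.
    without_separator = glob.glob(f'{base_dir}*.csv')

    # With os.path.join the pattern is '<parent>/files/*.csv'.
    with_separator = sorted(glob.glob(os.path.join(base_dir, '*.csv')))
    matched_names = [os.path.basename(path) for path in with_separator]
```

Unlike os.listdir, glob.glob returns each match with the directory part of the pattern included, so the results can be passed to pd.read_csv directly.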
CodePudding user response:
In os.listdir, "." represents the current directory.
Instead, you can pass os.path.join('.', 'subdir', 'subsubdir') to list the files in subdir/subsubdir/.
Complete code:
import pandas as pd
import os
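csv_dir = os.path.join('.', 'subdir', 'subsubdir')
# os.listdir returns bare names, so join each one with csv_dir
filepaths = [os.path.join(csv_dir, f) for f in os.listdir(csv_dir) if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))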
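Since a relative path like this is resolved against the current working directory, it can help to check in the notebook what that directory actually is. A quick sketch, with 'subdir' and 'subsubdir' being the placeholder names from above:

```python
import os

# Relative path, interpreted from the current working directory.
relative_dir = os.path.join('.', 'subdir', 'subsubdir')

# os.path.abspath shows where that relative path really points.
resolved_dir = os.path.abspath(relative_dir)

# The same location, built explicitly from os.getcwd().
expected_dir = os.path.normpath(os.path.join(os.getcwd(), 'subdir', 'subsubdir'))

print('working directory:', os.getcwd())
print('csv directory    :', resolved_dir)
```

If the printed directory is not the one you expect, passing an absolute path to os.listdir avoids the dependence on where the notebook was started.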
CodePudding user response:
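Just replace os.listdir(".") with os.listdir("csvFilesPath"), as indicated in the previous answer. Because os.listdir returns bare file names, also prefix each name with that directory, e.g. os.path.join("csvFilesPath", f), before reading it. I tried it and it works.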
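Putting the pieces together, a rough end-to-end sketch could look like the following. The function name concat_csv_dir, the sample files and the use of ignore_index=True (to get one continuous row index) are my own choices for illustration, not something from the question.

```python
import os
import tempfile

import pandas as pd


def concat_csv_dir(directory):
    """Read every .csv file directly inside `directory` into one DataFrame."""
    paths = sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith('.csv')
    )
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat([pd.read_csv(path) for path in paths], ignore_index=True)


# Demo with two small made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as csv_dir:
    with open(os.path.join(csv_dir, 'a.csv'), 'w') as handle:
        handle.write('x,y\n1,2\n')
    with open(os.path.join(csv_dir, 'b.csv'), 'w') as handle:
        handle.write('x,y\n3,4\n')

    df = concat_csv_dir(csv_dir)
```

In the notebook, concat_csv_dir("csvFilesPath") would then work regardless of which directory the notebook itself lives in, as long as the path is valid from the current working directory or is absolute.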
