Jupyter Notebook specify path to directory for concatenation of multiple .csv files

Time:01-20

The following code works well for concatenating multiple .csv files into one. All of these .csv files reside in the same directory. The problem is that it only works if my notebook is in the same directory as those .csv files. I have tried different syntaxes to specify the path to the directory where the multiple .csv files reside, but without success. Where should I specify the path to the .csv directory in the code below? By the way, I am working in Jupyter Notebook:

import pandas as pd
import os

filepaths = [f for f in os.listdir(".") if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))

CodePudding user response:

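os.listdir(dir) lists the names of the entries in the directory dir. In your example, dir='.', which corresponds to the current working directory (the directory from which you run your script). You can change this argument to the directory where your .csv files reside. Note that os.listdir returns bare file names, so each name has to be joined with the directory before it is passed to pd.read_csv.

import pandas as pd
import os

base_dir = os.path.join('path', 'to', 'files')
# os.listdir returns names only, so prepend base_dir to get usable paths
filepaths = [os.path.join(base_dir, f) for f in os.listdir(base_dir) if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))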
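
To make the point about bare file names concrete, here is a minimal sketch of a helper that returns full paths. The name csv_paths is hypothetical, not part of os or pandas, and the sorting is an added assumption so that the order of the files is predictable.

```python
import os


def csv_paths(base_dir):
    """Return the sorted full paths of the .csv files directly inside base_dir."""
    return sorted(
        # os.listdir gives names such as 'a.csv'; joining adds the directory back
        os.path.join(base_dir, name)
        for name in os.listdir(base_dir)
        if name.endswith(".csv")
    )
```

The returned list can be passed to pd.concat(map(pd.read_csv, ...)) from any working directory, because every entry already contains base_dir.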

Slower version with globbing

You can avoid using endswith() by globbing:

import pandas as pd
import os
import glob

base_dir = os.path.join('path', 'to', 'files')
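# join with os.path.join so there is a separator between base_dir and the pattern
filepaths = glob.glob(os.path.join(base_dir, '*.csv'))
df = pd.concat(map(pd.read_csv, filepaths))

This expands the wildcard * to match every file ending in .csv in base_dir. Unlike os.listdir, glob.glob returns paths that already include base_dir, so they can be passed to pd.read_csv directly.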
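
The same globbing idea can also be sketched with pathlib from the standard library. This is an alternative, not what the answer above uses, and the function name csv_paths_glob is hypothetical.

```python
from pathlib import Path


def csv_paths_glob(base_dir):
    """Return the sorted Path objects for the .csv files directly inside base_dir."""
    # Path.glob yields paths that already include base_dir
    return sorted(Path(base_dir).glob("*.csv"))
```

pd.read_csv accepts Path objects, so the result can be used in place of the filepaths list.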

CodePudding user response:

In os.listdir, "." represents the current working directory.

You can specify instead os.path.join('.', 'subdir', 'subsubdir') to list files in subdir/subsubdir/.

Complete code.

import pandas as pd
import os

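subdir = os.path.join('.', 'subdir', 'subsubdir')
# os.listdir returns names only, so join each one with subdir before reading it
filepaths = [os.path.join(subdir, f) for f in os.listdir(subdir) if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))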
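
If a relative path like this does not find the files, it can help to check what it resolves to. The following is a small sketch using the placeholder names subdir and subsubdir from above. In Jupyter the working directory is typically the folder containing the notebook, but that is an assumption worth verifying on your setup.

```python
import os

# Relative paths are resolved against the current working directory
cwd = os.getcwd()

# 'subdir' and 'subsubdir' are the placeholder names used in the answer
relative_dir = os.path.join(".", "subdir", "subsubdir")
absolute_dir = os.path.abspath(relative_dir)

print("working directory:", cwd)
print("will look for .csv files in:", absolute_dir)
print("directory exists:", os.path.isdir(absolute_dir))
```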

CodePudding user response:

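Just replace os.listdir(".") with os.listdir("csvFilesPath"), as indicated in the previous answer. I tried it and it works. Keep in mind that os.listdir returns bare file names, so each one has to be joined with "csvFilesPath" before it is read.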
