Home > Mobile >  XLSB to CSV with pandas, python
XLSB to CSV with pandas, python

Time:02-04

Trying to convert multiple XLSB files to CSV. Not sure what is the problem here

import os

import pandas as pd

path = r'C://Users//greencolor//Autoreport//Load_attachments//'
for filename in os.listdir(path):
    if filename.startswith("PB orders"):
        print(filename)                         #until here its working
        month = pd.read_excel(filename, sheet_name="Raw data ", engine="pyxlsb")
        print(month)                            # I get the error here
        month = month[month['Sales Manager'] == 'DEVON, JOHN'] #filtering by manager
        month.to_csv (path   filename   ".csv", index = None, header=True)

Error

FileNotFoundError: [Errno 2] No such file or directory: 'PB orders Dec.xlsb'

Why I get this error? print(filename) is printing all the XLSB files that name starts with PB orders

CodePudding user response:

filename is just the file's name, not the full path. You need to combine it with path to get the full path to the file. You can do that in a safe manner with os.path.join :

import os
...
for filename in os.listdir(path):
    if filename.startswith("PB orders"):
        full_path = os.path.join(path, filename)
        print(full_path )                         
        month = pd.read_excel(full_path , sheet_name="Raw data ", engine="pyxlsb")

Searching with a pattern

An alternative is to use glob to search for files that match a pattern. You still need to generate the full path:

import glob

...

for filename in glob.glob("PB orders*.xlsb", root_dir=path):
    full_path = os.path.join(path, filename)
    print(full_path )                         
    month = pd.read_excel(full_path , sheet_name="Raw data ", engine="pyxlsb")

Avoiding temp files

You still need to check the file name to avoid the temporary files generated when someone opens an Excel file (the files that start with ~) :

for filename in glob.glob("PB orders*.xlsb", root_dir=path):
    if not os.path.basename(filename).startswith("~"):
        full_path = os.path.join(path, filename)
            print(full_path )                         
            month = pd.read_excel(full_path , sheet_name="Raw data ", engine="pyxlsb")

CodePudding user response:

When you do month = pd.read_excel(filename, sheet_name="Raw data ", engine="pyxlsb") try to replace it with this:

month = pd.read_excel(path filename, sheet_name="Raw data ", engine="pyxlsb")

This will prepend the path to your filenames in the given directory.

  •  Tags:  
  • Related