Home > Software design >  python: write a dataframe groupby to a file
python: write a dataframe groupby to a file

Time:02-02

I have a file with the following that I am reading with python

Item Master Primary Spec/Common Information/Contract Number||Contract Master Primary Spec/cage code
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8C1C2||N6247820D2401
8C1C2||N6247820D2401
8C1C2||N6247820D2401

I am trying to get a count of the number of contracts. The below code seems to work when I print it (although the header columns are reversed for some reason), but not when I try to output it to a file.

import pandas as pd

fname="mdm.export.item.master.delta.1335.20220120011500_125_125.csv"
fdir="./data/"
df = pd.read_csv(fdir fname, sep='\|\|', keep_default_na=False, engine='python')

uniqContract=df.groupby(['Item Master Primary Spec/Common Information/Contract Number']).count()
print(uniqContract)

file = open("testfile.txt","w") 
for items in uniqContract:
    file.writelines(items '\n')
file.close() 

This is the print output

(base) PS D:\02-MyLocalFiles> python .\helloworld.py
Contract Master Primary Spec/cage code Item Master Primary Spec/Common Information/Con...
8AND3                                                                                    5
8C1C2                                                                                    3
(base) PS D:\02-MyLocalFiles\python\backlog_report>

But this is the output to the file

Contract Master Primary Spec/cage code

What am I doing wrong?

CodePudding user response:

uniqContract=df.groupby(['Item Master Primary Spec/Common Information/Contract Number']).count().reset_index()
uniqContract.to_csv('testfile.txt', delimiter='\t')

You can call .reset_index() on your groupby count aggregation, then write that to a (text) csv file directly. You can specify the delimiter (this is tabs).

CodePudding user response:

uniqContract is a dataframe, and when you loop over a dataframe, it loops over the dataframes columns. uniqContract looks like this:

                                                    Contract Master Primary Spec/cage code
Item Master Primary Spec/Common Information/Con...                                        
8AND3                                                                                    5
8C1C2                                                                                    3

...which has only one column, so that's why you get Contract Master Primary Spec/cage code in your file.

CodePudding user response:

Since you're using pandas.dataFrame read_csv

Use pandas.dataFrame to_csv with no index column so that you don't get an extra id for every row.

uniqContract.to_csv("testfile.csv", index=False)

A link to the documentation for https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html

  •  Tags:  
  • Related