I have a file with the following that I am reading with python
Item Master Primary Spec/Common Information/Contract Number||Contract Master Primary Spec/cage code
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8AND3||SP47W117D0015
8C1C2||N6247820D2401
8C1C2||N6247820D2401
8C1C2||N6247820D2401
I am trying to get a count of the number of contracts. The below code seems to work when I print it (although the header columns are reversed for some reason), but not when I try to output it to a file.
import pandas as pd
fname="mdm.export.item.master.delta.1335.20220120011500_125_125.csv"
fdir="./data/"
df = pd.read_csv(fdir fname, sep='\|\|', keep_default_na=False, engine='python')
uniqContract=df.groupby(['Item Master Primary Spec/Common Information/Contract Number']).count()
print(uniqContract)
file = open("testfile.txt","w")
for items in uniqContract:
file.writelines(items '\n')
file.close()
This is the print output
(base) PS D:\02-MyLocalFiles> python .\helloworld.py
Contract Master Primary Spec/cage code Item Master Primary Spec/Common Information/Con...
8AND3 5
8C1C2 3
(base) PS D:\02-MyLocalFiles\python\backlog_report>
But this is the output to the file
Contract Master Primary Spec/cage code
What am I doing wrong?
CodePudding user response:
uniqContract=df.groupby(['Item Master Primary Spec/Common Information/Contract Number']).count().reset_index()
uniqContract.to_csv('testfile.txt', delimiter='\t')
You can call .reset_index() on your groupby count aggregation, then write that to a (text) csv file directly. You can specify the delimiter (this is tabs).
CodePudding user response:
uniqContract is a dataframe, and when you loop over a dataframe, it loops over the dataframes columns. uniqContract looks like this:
Contract Master Primary Spec/cage code
Item Master Primary Spec/Common Information/Con...
8AND3 5
8C1C2 3
...which has only one column, so that's why you get Contract Master Primary Spec/cage code in your file.
CodePudding user response:
Since you're using pandas.dataFrame read_csv
Use pandas.dataFrame to_csv with no index column so that you don't get an extra id for every row.
uniqContract.to_csv("testfile.csv", index=False)
A link to the documentation for https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
