I want to add a new row to a pandas DataFrame each time I run my program. I don't know the data in advance: the functions store the data in variables, and I want to add these variables as a row. So far I have only managed to add one row, and each time I run the program that row is replaced by the new one. I don't want the row to be replaced; I want the new row to be added below it.
net_index = mylist.index('NET PAYE EN EUROS ')
net = mylist[net_index + 2]
total_index= mylist.index('CONGES ')
total = (mylist[total_index-1])
df = pd.DataFrame(columns=['Mois','Nom','Adresse','Net_payé','Total_versé'])
new = {'Mois': mois, 'Nom': nom, 'Adresse': adresse,'Net_payé':net, 'Total_versé':total}
df= df.append(new, ignore_index=True)
This is part of my code. First I create an empty DataFrame with the column names, and then a dict with the variables, which are supposed to change on each run.
This is the result I get, but each time I run the program, the row is replaced by the next one instead of being added.
I suppose I have to use a loop, but it never works properly. I have searched everywhere for a solution without finding one.
So do you know what I can do? Thank you so much.
CodePudding user response:
Apparently, you are not saving the dataframe anywhere. Once your program exits, all data and variables are erased (lost). You cannot retrieve data from a previous run. The solution is to save the dataframe into a file before exiting your program. Then for each run, load the previous data from file.
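A minimal sketch of that load-then-save cycle, assuming the data is kept in a CSV file. The helper name add_row, the file name history.csv, and the sample values are hypothetical and only serve the illustration:

```python
import os
import tempfile

import pandas as pd

COLUMNS = ['Mois', 'Nom', 'Adresse', 'Net_payé', 'Total_versé']


def add_row(csv_path, new_row):
    """Load the rows saved by earlier runs, add one row, and save everything back."""
    new_df = pd.DataFrame([new_row], columns=COLUMNS)
    if os.path.exists(csv_path):
        # Rows written by previous runs come first, the new row goes last
        old_df = pd.read_csv(csv_path, encoding='utf-8')
        df = pd.concat([old_df, new_df], ignore_index=True)
    else:
        # First run: there is nothing to load yet
        df = new_df
    df.to_csv(csv_path, header=True, index=False, encoding='utf-8')
    return df


# Simulate two separate runs of the program (made-up values, temporary folder)
csv_path = os.path.join(tempfile.mkdtemp(), 'history.csv')
add_row(csv_path, {'Mois': 'Janvier', 'Nom': 'Dupont', 'Adresse': '1 rue A',
                   'Net_payé': 1500, 'Total_versé': 1800})
result = add_row(csv_path, {'Mois': 'Février', 'Nom': 'Dupont', 'Adresse': '1 rue A',
                            'Net_payé': 1520, 'Total_versé': 1830})
print(result)
```

The key point is that the DataFrame is no longer created empty on every run: it starts from whatever the file already contains, so the second call ends up with two rows.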
CodePudding user response:
Actually, yes, I do save the DataFrame to a CSV file, because my goal is to write the variables' values to a CSV. But the result is the same as I showed before: each run replaces the first row instead of adding a new one.
df = pd.DataFrame(columns=['Mois','Nom', 'Adresse','Net_payé','Total_versé'])
new = {'Mois': mois, 'Nom': nom, 'Adresse': adresse,'Net_payé':net, 'Total_versé':total}
df =df.append(new, ignore_index=True)
df.to_csv('test.csv', header=True, index=False, encoding='utf-8')
Thanks for your reply!
CodePudding user response:
There are multiple ways to add rows to an existing DataFrame. One way is to use pd.concat, of which the df.append function on the last code line in your question is a specific use case.
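As a rough sketch of the pd.concat route: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions pd.concat is the way to add a row. The existing row and the values in the dict below are hypothetical placeholders for the variables from the question:

```python
import pandas as pd

columns = ['Mois', 'Nom', 'Adresse', 'Net_payé', 'Total_versé']

# An existing DataFrame that already holds one row (made-up values)
df = pd.DataFrame([['Janvier', 'Dupont', '1 rue A', 1500.0, 1800.0]], columns=columns)

# Hypothetical values standing in for mois, nom, adresse, net and total
new = {'Mois': 'Février', 'Nom': 'Dupont', 'Adresse': '1 rue A',
       'Net_payé': 1520.0, 'Total_versé': 1830.0}

# Wrap the dict in a one-row DataFrame, then concatenate it below the existing rows
df = pd.concat([df, pd.DataFrame([new])], ignore_index=True)
print(df)
```

With ignore_index=True the result is renumbered from 0, which mirrors what df.append(new, ignore_index=True) used to do.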
However, the method I prefer is to create a nested list that contains my data, and then create a new DataFrame from scratch. Like this:
# Each list within this list contains your data in the order of the
# columns you specified: ['Mois','Nom','Adresse','Net_payé','Total_versé']
data = [[1,2,3,4,5], [8,9,10,11,12]]
# Create a new DataFrame, using the *data* variable and your column names
df = pd.DataFrame(data=data, columns=['Mois','Nom','Adresse','Net_payé','Total_versé'])
Which results in:
   Mois  Nom  Adresse  Net_payé  Total_versé
0     1    2        3         4            5
1     8    9       10        11           12
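If the rows are produced one at a time, the same idea can be sketched with a loop that fills the nested list first and builds the DataFrame once at the end. The payslips list below is a made-up stand-in for whatever your functions extract:

```python
import pandas as pd

columns = ['Mois', 'Nom', 'Adresse', 'Net_payé', 'Total_versé']

# Hypothetical stand-ins for the values extracted from each document
payslips = [
    ('Janvier', 'Dupont', '1 rue A', 1500.0, 1800.0),
    ('Février', 'Dupont', '1 rue A', 1520.0, 1830.0),
    ('Mars', 'Dupont', '1 rue A', 1490.0, 1790.0),
]

data = []
for mois, nom, adresse, net, total in payslips:
    # One inner list per row, in the same order as the columns
    data.append([mois, nom, adresse, net, total])

# Build the DataFrame once, after all rows have been collected
df = pd.DataFrame(data=data, columns=columns)
print(df)
```

Note that this only covers rows collected within a single run; rows from earlier runs still have to be loaded from a file first.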
