I created a simple dataset df with three columns, Area, Year_2010, Year_2020. The related code:
# Import pandas library
import pandas as pd
# initialize list of lists
data = [['Netherlands', 100, 200], ['Belgium', 15, 80], ['Germany', 125, 300]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Area', 'Year_2010', 'Year_2020'])
# print dataframe.
df
#prints
Area Year_2010 Year_2020
0 Netherlands 100 200
1 Belgium 15 80
2 Germany 125 300
For two of the three columns (being Year_2010 and Year_2020), I would like to obtain the sum of all related rows. The output of the sum should appear in a new row for both columns.
Ideally, the output reads:
Area Year_2010 Year_2020
0 Netherlands 100 200
1 Belgium 15 80
2 Germany 125 300
3 240 580
However, if I apply
total = df.sum(axis=0)
df = df.append(total, ignore_index=True)
The Area column will also be aggregated, leading to concatenation of country names in the Area column:
Area Year_2010 Year_2020
0 Netherlands 100 200
1 Belgium 15 80
2 Germany 125 300
3 NetherlandsBelgiumGermany 240 580
How can I get the row sums for specific columns, while leaving out other columns?
As an alternative to leaving the value NetherlandsBelgiumGermany blank, the value can be replaced by Total.
CodePudding user response:
One solution could be to filter the columns by type and use indexing:
cols = df.select_dtypes('number').columns
df.loc[len(df), cols] = df[cols].sum()
updated DataFrame:
Area Year_2010 Year_2020
0 Netherlands 100.0 200.0
1 Belgium 15.0 80.0
2 Germany 125.0 300.0
3 NaN 240.0 580.0
Alternative if you want the empty string:
cols = df.select_dtypes('number').columns
df.loc[len(df)] = df[cols].sum().reindex(df.columns, fill_value='')
output:
Area Year_2010 Year_2020
0 Netherlands 100 200
1 Belgium 15 80
2 Germany 125 300
3 240 580
NB. this assumes a range index, if this is not the case, please update the example for alternative ways (using pandas.concat for example).
CodePudding user response:
You could do that manually:
df.at[3, "Area"] = "" # or whatever else you want
