I have a dataframe where the column names are dates and the values are numbers of cases. There are 300 daily columns. How do I create a Series where the index is the column names (dates) and the values are the total number of cases in each column?
Input:
Country 1/1/2020 1/2/2020 ... 12/31/2020
0 Afghanistan 50 100 ... 500
1 Albania 20 50 ... 50
...
99 Zimbabwe 6 10 ... 5
Desired output (pd.Series):
1/1/2020 76
1/2/2020 160
...
12/31/2020 555
CodePudding user response:
Use DataFrame.sum with the numeric_only=True parameter, which skips the non-numeric "Country" column:
s = df.sum(numeric_only=True)
print(s)
1/1/2020 76
1/2/2020 160
12/31/2020 555
dtype: int64
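If you want to try this yourself, here is a minimal sketch. It assumes a small stand-in frame built only from the three rows and three date columns visible in the question (the elided rows and columns are left out), so the totals match the desired output shown there.

```python
import pandas as pd

# Stand-in for the question's frame: only the rows/columns shown in the post.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Zimbabwe"],
    "1/1/2020": [50, 20, 6],
    "1/2/2020": [100, 50, 10],
    "12/31/2020": [500, 50, 5],
})

# Sum down each column; numeric_only=True leaves the string column out,
# so the result is a Series indexed by the date column names.
s = df.sum(numeric_only=True)
print(s)
```

The result is a pd.Series whose index holds the original column labels, which is what the question asks for.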
CodePudding user response:
You can set "Country" as the index and sum:
out = df.set_index('Country').sum()
or drop "Country" and sum:
out = df.drop(columns='Country').sum()
Output:
1/1/2020 76
1/2/2020 160
12/31/2020 555
dtype: int64
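A small sketch to check that both variants give the same Series, again assuming a stand-in frame made from the rows shown in the question. The last step is optional and goes slightly beyond what was asked: it assumes the column labels are strings in month/day/year form and converts the resulting index to real dates.

```python
import pandas as pd

# Stand-in for the question's frame: only the rows/columns shown in the post.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Zimbabwe"],
    "1/1/2020": [50, 20, 6],
    "1/2/2020": [100, 50, 10],
    "12/31/2020": [500, 50, 5],
})

# Variant 1: move "Country" into the index, then sum each remaining column.
out_indexed = df.set_index("Country").sum()

# Variant 2: drop "Country" entirely, then sum each remaining column.
out_dropped = df.drop(columns="Country").sum()

print(out_indexed.equals(out_dropped))

# Optional: turn the string labels into a DatetimeIndex
# (assumes month/day/year labels, as in the question).
dated = out_indexed.copy()
dated.index = pd.to_datetime(dated.index, format="%m/%d/%Y")
print(dated)
```

Either variant works; set_index keeps the country names around as row labels if you need them later, while drop simply discards them.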
CodePudding user response:
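Transposing and summing also works, as long as "Country" is moved out of the values first and you sum across the rows of the transposed frame:
df.set_index('Country').T.sum(axis=1)
Where df is your DataFrame. A plain df.T.sum() sums along the wrong axis: it totals per country rather than per date, and the "Country" strings end up mixed in with the numbers.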
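A quick sanity check of which axis a sum runs along before and after a transpose. This is only a sketch on a stand-in frame built from the rows shown in the question; the variable names are made up for illustration.

```python
import pandas as pd

# Stand-in for the question's frame: only the rows/columns shown in the post.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Zimbabwe"],
    "1/1/2020": [50, 20, 6],
    "1/2/2020": [100, 50, 10],
    "12/31/2020": [500, 50, 5],
})

# Keep only the counts: countries as row labels, dates as columns.
counts = df.set_index("Country")

# Default axis=0 sums down each column: one total per date.
per_date = counts.sum()

# After transposing, dates are the rows, so sum across columns with axis=1.
per_date_via_T = counts.T.sum(axis=1)

# The default sum on the transposed frame gives one total per country instead.
per_country = counts.T.sum()

print(per_date.equals(per_date_via_T))
print(per_country)
```

Since counts.sum() already returns the per-date totals, the transpose is not needed for this question; it is mainly useful when you want dates as row labels for other reasons.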
