Given:
pd.DataFrame({'ranges': [2,4,6,2,4,1,2,5,3,2,3,5,6,3,2,1,4,6,3,2]})
I would like to have a new DataFrame where the 20 values (actually 1000 in my real case) get reduced to 4 values, each being the average (or other function) of the corresponding group of 5, so in other words:
average of (2,4,6 and 2), average of (4,1,2,5) etc.
It's like downsampling and it's related to binning. I am stumped. I bet it becomes a one-liner.
CodePudding user response:
You can floor divide the index and groupby it to find the mean of each group:
out = df.groupby(df.index//5)['ranges'].mean()
Output:
0 3.6
1 2.6
2 3.8
3 3.2
Name: ranges, dtype: float64
If the number of rows is divisible by the size of each group, we can use numpy:
out = df['ranges'].to_numpy().reshape(-1,5).mean(axis=1)
Output:
array([3.6, 2.6, 3.8, 3.2])
CodePudding user response:
Try this:
>>> df.groupby(df.index // 4)['ranges'].mean()
0 3.50
1 3.00
2 3.25
3 3.00
4 3.75
Name: ranges, dtype: float64
