Binning a series and returning within-bin sums, not occurrences, as a histogram

Time:01-21

I hope the title describes the situation clearly.

Let's say I have a series whose first 12 elements are shown below:

[image: first 12 elements of the series]

I'd like a histogram with specified bins, but instead of the number of occurrences in each bin I want the sum of the values that fall in it. For these 12 elements, the result should be:

  • for the 51-60 bin: 53 + 53 + 55 + 58 + 59, not 5
  • for the 71-80 bin: 71 + 72 + 72, not 3
  • for the 101-110 bin: 102 + 102, not 2
  • for the 111-120 bin: 114, not 1
  • for the other bins: just 0.

I tried cumsum, but that was not what I was looking for. For the bins, I used np.arange(0, 130, 10) with pd.cut.

There is a similar question here, but I couldn't adapt it to my case.

CodePudding user response:

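You can group by the intervals defined by pd.cut and then sum the actual values. Passing observed=False makes sure the empty bins stay in the result with a sum of 0:

import numpy as np
import pandas as pd

s = pd.Series([114, 102, 102, 72, 72, 71, 59, 58, 58, 55, 53, 53])

s.groupby(pd.cut(s, bins=np.arange(0, 130, 10)), observed=False).sum()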

(0, 10]         0
(10, 20]        0
(20, 30]        0
(30, 40]        0
(40, 50]        0
(50, 60]      336
(60, 70]        0
(70, 80]      215
(80, 90]        0
(90, 100]       0
(100, 110]    204
(110, 120]    114
dtype: int64
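
Since the question asks for a histogram, the same sums can probably also be obtained with np.histogram by passing the values themselves as weights. This is only a sketch of an alternative, not part of the answer above; the names edges, sums and bin_sums are made up for illustration. One caveat: np.histogram bins are closed on the left, [a, b), while pd.cut defaults to (a, b], so the two approaches can disagree for values that sit exactly on a bin edge (none do in this sample).

```python
import numpy as np
import pandas as pd

s = pd.Series([114, 102, 102, 72, 72, 71, 59, 58, 58, 55, 53, 53])
edges = np.arange(0, 130, 10)

# weights=... makes each element add its own value to its bin instead of 1
sums, _ = np.histogram(s.to_numpy(), bins=edges, weights=s.to_numpy())

# Label the bins with left-closed intervals to match np.histogram's convention
# (its last bin also includes the right edge, which the labels do not show)
bin_sums = pd.Series(sums, index=pd.IntervalIndex.from_breaks(edges, closed="left"))
print(bin_sums)
```

If the goal is a plot rather than a table, the resulting series can be handed to a bar plot, with the bin labels on the x axis.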