Most computationally efficient way to count consecutive repeating values

Say I have a boolean array:

a2 = np.array([False, False, True, False, False, True, True, True, False, False])

I want an array containing the sum of each group of consecutive True values, i.e. the length of each run of True.

Desired result:

np.array([1, 3])

Current solution:

sums = []
current_sum = 0
prev = False
for boo in a2:
    if boo:
        current_sum += 1
        prev = True
    if prev and not boo:
        sums.append(current_sum)
        current_sum = 0
    if not boo:
        prev = False
np.array(sums)

This may not be the most computationally efficient approach. It seems like np.cumsum could be used in a creative manner, but I am not able to think of a solution.

CodePudding user response：

You could use a list comprehension with np.split and np.flatnonzero:

l = [l.sum() for l in np.split(a2, np.flatnonzero(~a2)) if l.sum() > 0]

Output:

>>> l
[1, 3]
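
Since the question mentions np.cumsum, here is a minimal sketch of how that idea might look without building intermediate sub-arrays. The names `running`, `nxt`, `ends` and `run_lengths` are just illustrative, and it assumes a NumPy version where np.diff accepts `prepend`:

```python
import numpy as np

a2 = np.array([False, False, True, False, False, True, True, True, False, False])

# Running count of True values seen so far.
running = np.cumsum(a2)

# A run ends at index i when a2[i] is True and the next element
# is False (the appended False covers a run that reaches the end).
nxt = np.append(a2[1:], False)
ends = np.flatnonzero(a2 & ~nxt)

# The running count sampled at each run end, then differenced,
# gives the length of each run.
run_lengths = np.diff(running[ends], prepend=0)
print(run_lengths)  # [1 3]
```

This stays entirely in NumPy, so it should scale better than a Python-level loop on large arrays, though that is worth timing on your own data.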

CodePudding user response：

Another solution with itertools

import numpy as np
import itertools

a2= np.array([False, False, True, False, False, True, True, True, False, False])

foo = [ sum( 1 for _ in group ) for key, group in itertools.groupby( a2 ) if key ]

print(foo)

Output:

[1, 3]

CodePudding user response：

Another way, using np.where and np.diff to identify the split locations:

out = [ar.sum() for ar in np.split(a2, np.where(np.diff(a2.astype(int)) == 1)[0] + 1)[1:]]

Output:

[1, 3]
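
The same diff idea can probably be taken one step further to drop the Python-level loop over the split pieces. A rough sketch, assuming the input is a 1-D boolean array; the helper name `true_run_lengths` is hypothetical, not something from NumPy:

```python
import numpy as np

def true_run_lengths(mask):
    """Return the length of each run of consecutive True values in a 1-D mask."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False on both sides so runs touching either end
    # still get a start edge and a stop edge.
    padded = np.concatenate(([False], mask, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # False -> True transitions
    stops = np.flatnonzero(edges == -1)   # True -> False transitions
    return stops - starts

a2 = np.array([False, False, True, False, False, True, True, True, False, False])
print(true_run_lengths(a2))  # [1 3]
```

Because of the padding, this also handles arrays that start or end with True, which the `[1:]` slice in the one-liner above does not account for when the very first element is True.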