numpy.searchsorted(a, v, ...) basically finds the indices of the first elements in a sorted array a that are larger than or equal to the elements in v. I assume that this is faster than a method that does not take advantage of the fact that a is sorted.
Now what if v is also sorted? Certainly it should be possible to exploit this knowledge to make the algorithm faster? (Perhaps I'm wrong and the design of searchsorted makes this irrelevant; if so please excuse my question.)
So the question is: What would be the fastest way to do the same as searchsorted(a, v) but for both a and v sorted?
CodePudding user response:
Off the top of my head, I would assume that you could solve the problem thusly:
def searchsorted_sorted(x, a):
y = np.empty_like(a)
l = 0
for (i, s) in enumerate(a):
pos = l np.searchsorted(x[l:], s)
y[i] = pos
l = y[i]
return y
So, you could use the knowledge of v being ordered to restrict the search space for subsequent searches.
Since numpy likely uses binary search, the difference should be more pronounced when the values in v are cluttered towards the end of a. If you have a large number M of entries and only search for the indices of the last n, the second to n-th search restrict the length to be searched from M to M - n. Thus, rather than taking O(n*log(M)) you would only need O(n*log(n)), which in theory makes a difference. Caveats:
- Maybe
numpyalready does something similar? - Since
numpyis implemented in C, it is not a fair comparison with this function (partly) written in python. - This implementation is rather cache-unfriendly, further reducing its usefulness.
CodePudding user response:
When a and v have approximately the same size, the best algorithm consist in merging the two arrays (see: merge algorithm) (note you only need the indices and not the output merged array). This can be done in linear time (ie. O(n M)) instead of a quasi-linear O(n log M) time.
Otherwise the problem can be solved in O(n log log M) time instead of O(n log n) time. This is only interesting if a is huge since log M is already quite small and one should be careful about hidden constants in algorithmic complexities. The algorithm is not trivial to implement. Here is the idea:
- Pick the middle value
v[k]ofvand split virtuallyvin two part (left and right ones) - Perform a binary search of
v[k]inaso to get its locationp - Now we can be sure that the items in the left part of
vare located ina[:p 1]and the right part are ina[p:](one side can be excluded in case of equality but let put this aside for sake of simplicity) - Recursively apply the operation by calling
search(a[:p 1], v[:k])andsearch(a[p:], v[k 1:])
Note that for the second algorithm, if v contains i multiple same values, then you can copy the location found i times and find a more strict lower/upper bound in a so to avoid degenerated case possibly resulting in a worse complexity.
Because of the CPython loop overhead and the one of Numpy function calls, this implementation needs to be implemented in native languages like C/C or in Cython or possibly using JITs like Numba.
