Let's say I have a double like 3.5 and I would like to find out where it would be sorted into an existing sorted vector, say seq(1, 10). Put differently: which index would the new number take in the vector? Of course it sits somewhere between 3 and 4, and hence between the third and fourth index, but what would be the fastest way to arrive at this result?
CodePudding user response:
As mentioned in the comments, findInterval is the fastest option for this purpose. Even a very simple linear-scan loop in C++ (via Rcpp) that does essentially the same job is a little slower on average in the benchmark below.
library(Rcpp)
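cppFunction("int find_index(double x, NumericVector v) {
  // Linear scan: return the 1-based position of the first element >= x
  int len = v.size();
  for (int i = 0; i < len; i++) {
    if (x <= v[i]) return i + 1;
  }
  // x is larger than every element of v
  return NA_INTEGER;
}")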
microbenchmark::microbenchmark(
findInterval = findInterval(453993.5, 1:1000000),
find_index = find_index(453993.5, 1:1000000)
)
#> Unit: milliseconds
#>          expr    min     lq      mean  median     uq      max neval
#>  findInterval 1.9646 2.1739  2.996931 2.32375 2.4846  37.4218   100
#>    find_index 2.2151 2.4502 11.319199 2.60925 2.9800 337.9229   100
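For the small example from the question, here is a minimal sketch of how findInterval could be turned into an insertion index. The names v, x, pos, insert_pos and new_vec are only illustrative, and the sketch assumes v is already sorted in non-decreasing order.

```r
v <- seq(1, 10)
x <- 3.5

# findInterval() returns the index of the last element of v that is <= x
# (0 if x is smaller than every element of v).
pos <- findInterval(x, v)

# The new value would therefore take the next position in the vector.
insert_pos <- pos + 1

# append() places x after position `pos`, so the result stays sorted.
new_vec <- append(v, x, after = pos)
```

Note that findInterval gives the position of the element to the left of x, so 1 has to be added to get the index the new number would take.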
CodePudding user response:
Something like this?
First define dbl and my_seq,
then concatenate them with c(dbl, my_seq) and wrap the result in sort(),
then find the index with which(my_vec == dbl):
dbl <- 3.5
my_seq <- seq(1,10)
my_vec <- sort(c(dbl, my_seq))
index <- which(my_vec == dbl)
index
output:
[1] 4
