Home > Enterprise >  Suppose an array contains only two kinds of elements, how to quickly find their boundaries?
Suppose an array contains only two kinds of elements, how to quickly find their boundaries?

Time:01-13

I've asked a similar question before, but this time it's different.

Since our array contains only two elements, we might as well set it to 1 and -1, where 1 is on the left side of the array and -1 is on the right side of the array:

[1,...,1,1,-1,-1,...,-1]

Both 1 and -1 exist at the same time and the number of 1 and -1 is not necessarily the same. Also, the numbers of 1 and -1 are both very large.

Then, define the boundary between 1 and -1 as the index of the -1 closest to 1. For example, for the following array:

[1,1,1,-1,-1,-1,-1]

Its boundary is 3.

Now, for each number in the array, I cover it with a device that you have to unlock to see the number in it.

I want to try to unlock as few devices as possible that cover 1, because it takes much longer to see a '1' than it takes to see a '-1'. And I also want to reduce my time cost as much as possible.

How can I search to get the boundary as quickly as possible?

CodePudding user response:

The problem is very like the "egg dropping" problem, but where a wrong guess has a large fixed cost (100), and a good guess has a small cost (1).

Let E(n) be the (optimal) expected cost of finding the index of the right-most 1 in an array (or finding that the array is all -1), assuming each possible position of the boundary is equally likely. Define the index of the right-most 1 to be -1 if the array is all -1.

If you choose to look at the array element at index i, then it's -1 with probability i/(n 1), and 1 with probability (n-i 1)/(n 1).

So if you look at array element i, your expected cost for finding the boundary is (1 E(i)) * i/(n 1) (100 E(n-i-1)) * (n-i 1)/(n 1).

Thus E(n) = min((1 E(i)) * i/(n 1) (100 E(n-i-1)) * (n-i 1)/(n 1), i=0..n-1)

For each n, the i that minimizes the equation is the optimal array element to look at for an array of that length.

I don't think you can solve these equations analytically, but you can solve them with dynamic programming in O(n^2) time.

The solution is going to look like a very skewed binary search for large n. For smaller n, it'll be skewed so much that it will be a traversal from the right.

CodePudding user response:

If I am right, a strategy to minimize the expectation of the cost is to draw at a fraction of the interval that favors the -1 outcome, in inverse proportion of the cost. So instead of picking the middle index, take the right centile.

But this still corresponds to a logarithmic asymptotic complexity.

There is probably nothing that you can do regarding the worst case.

  •  Tags:  
  • Related