I have a series of prices throughout the day. I need to find whether the price at each time point will first go up by 50% or drop by 50% by the end of the day.
Currently I can only think about using brute force -- compare the current price with each of the following prices until I meet the above requirement or go through all the prices in the day.
Are there any good methods to solve the problem? Thank you very much in advance!
there are six price movements by the end of the day.
price=[1, 1.3, 0.9, 1.5, 0.65, 3]
label=[1, -1, 1, -1, 1]
Explanation:
at t0, the price is 1 and it first goes up to 1.5 rather than down to 0.5, so the label is 1.
at t1, the price is 1.3 and it first drops down to 0.65 rather than up to 1.95, so the label is -1.
at t2, the price is 0.9 and it first goes up to 1.5 (0.9*1.5=1.35) rather than down to 0.45(0.9*0.5), so the label is 1.
at t3, the price is 1.5 and it first drops down to 0.65 (1.5*0.5=0.75) rather than up to 3, so the label is -1.
at t4, the price is 0.65 and it first goes up to 3 (0.65*1.5=0.975) rather than down to 0.325 (0.65*0.5), so the label is 1.
If the price change lies within -50% - 50%, the label will be 0.
CodePudding user response:
I think you can do better than the naive algorithm. However: Much of this depends on the amount of data you have and the implementation. Remember, that an implementation with an O(n * n) algorithm can be faster than one with an O(n) algorithm, depending in particular on the input size (n) and the constant factors from the implementation. For that reason, the first thing you have to do is to define benchmarks.
Anyhow, I'd rather approach this the other way around. Instead of searching for a label for each price in turn, combine a new price with the existing ones that don't have a label yet. A rough outline of this algorithm would go like this:
- Take a new price from the sequence of prices.
- Label all unlabelled previous prices with -1 where the previous price is at least twice this price.
- Label all unlabelled previous prices with 1 where the previous price is at most two thirds this price.
- Mark the new price as unlabelled.
- Repeat until no more new prices are available.
- Label all remaining unlabelled prices with 0.
In the implementation, store the unlabelled prices in a sorted container. That makes it easy to find prices to label with -1 or 1, because they are at either end of the container.
Applying this to your example data would go like this:
unlabelled=[]- handle
1@t0unlabelled=[1@t0]
- handle
1.3@t1unlabelled=[1@t0, 1.3@t1]
- handle
0.9@t2unlabelled=[0.9@t2, 1@t0, 1.3@t1]
- handle
1.5@t3- label
0.9@t2with 1 - label
1@t0with 1 unlabelled=[1.3@t1, 1.5@t3]
- label
- handle
0.65@t4- label `1.3@t1 with -1
- label
1.5@t3with -1 unlabelled=[0.65@t4]
- handle
3@t5- label
0.65@t4with 1 unlabelled=[3@t5]
- label
- label
3@t5with 0
Notes:
- The general advise is to consider sorting. If you have an algorithm that can be improved by sorting, consider sorting as a previous step. This general rule doesn't apply directly, only indirectly by keeping the list of unlabelled items sorted.
- Much of this depends on the container you use for the unlabelled items. I'd try a linked list, an array and a tree. Also, some other data structures are possible that use the fact that the price is a scalar. Still, as already mentioned, you need benchmarks!
