So I have a table called optiontracking with ConID/TimeLog/Ask/Bid table that looks like the following:
| ConID | TimeLog | Ask | Bid |
|---|---|---|---|
| 1 | 2020-06-03 10:18:43 | 0.52 | 0.55 |
| 1 | 2020-06-03 10:20:41 | 0.55 | 0.42 |
| 1 | 2020-06-03 10:23:54 | 0.60 | 0.58 |
| 1 | 2020-06-03 10:26:26 | 0.52 | 0.51 |
| 1 | 2020-06-03 10:28:57 | 0.51 | 0.50 |
| 1 | 2020-06-04 10:25:57 | 0.65 | 0.64 |
| 1 | 2020-06-04 10:26:57 | 0.80 | 0.79 |
| 2 | 2020-06-03 10:18:43 | 1.36 | 1.33 |
| 2 | 2020-06-03 10:20:41 | 1.38 | 1.35 |
| 2 | 2020-06-03 10:23:54 | 1.33 | 1.32 |
| 2 | 2020-06-03 10:26:26 | 1.25 | 1.22 |
| 2 | 2020-06-03 10:28:57 | 1.20 | 1.19 |
What I am trying to get is for each ConID/TimeLog/Ask entry in this table, chronologically next Max(Bid), TimeLog of Max(Bid) while matching the same day and ConID.
I expect to repeat this for Max Ask (but I figure that will need 1 subquery for each group).
So far I have been able to extract the max Bid chronologically after each TimeLog/ask in the dataset using the query below. I want to be able to pull the corresponding timestamp where the MAX(Bid) was retrieved from, but I am hitting a wall without using a second subquery (which I am trying to avoid due to data size).
Here is the query I am using to get the current max Bid per each Ask Chronologically:
SELECT DISTINCT OT.[ConID], OT.[TimeLog], OT.[Ask], MAX(MaxBid.[Bid]) as SellPrice from
(
SELECT * from algotrading.[optiontracking] WHERE YEAR([TimeLog]) <= 2020 and Month([TimeLog]) <= 6
) as OT
LEFT JOIN
(
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE YEAR([TimeLog]) <= 2020 and Month([TimeLog]) <= 6
) as MaxBid
on OT.[ConID] = MaxBid.[ConID] and MaxBid.[CombDate] = CONVERT(date, OT.[TimeLog]) and DATEDIFF(second, OT.[TimeLog], MaxBid.[TimeLog]) > 0
GROUP BY OT.[ConID], OT.[TimeLog], OT.[Ask], CONVERT(date, OT.[TimeLog])
I know there is a way to get the corresponding [TimeLog] for each Max([Bid]), but I can't quite seem to figure it out. The Year/Month filtering on each query is because the dataset is too large and I'm trying to do testing first.
Here is what I would expect as an output from the table above (including the TimeLog for each max bid listed as SellPriceTime below):
| ConID | TimeLog | Ask | SellPrice | SellPriceTime |
|---|---|---|---|---|
| 1 | 2020-06-03 10:18:43 | 0.52 | 0.58 | 2020-06-03 10:23:54 |
| 1 | 2020-06-03 10:20:41 | 0.55 | 0.58 | 2020-06-03 10:23:54 |
| 1 | 2020-06-03 10:23:54 | 0.60 | 0.51 | 2020-06-03 10:26:26 |
| 1 | 2020-06-03 10:26:26 | 0.52 | 0.50 | 2020-06-03 10:23:54 |
| 1 | 2020-06-03 10:28:57 | 0.51 | NULL | NULL |
| 1 | 2020-06-04 10:25:57 | 0.65 | 0.79 | 2020-06-04 10:26:57 |
| 1 | 2020-06-04 10:26:57 | 0.80 | NULL | NULL |
| 2 | 2020-06-03 10:18:43 | 1.36 | 1.35 | 2020-06-03 10:20:41 |
| 2 | 2020-06-03 10:20:41 | 1.38 | 1.32 | 2020-06-03 10:23:54 |
| 2 | 2020-06-03 10:23:54 | 1.33 | 1.22 | 2020-06-03 10:26:26 |
| 2 | 2020-06-03 10:26:26 | 1.25 | 1.19 | 2020-06-03 10:28:57 |
| 2 | 2020-06-03 10:28:57 | 1.20 | NULL | NULL |
CodePudding user response:
You can use an OUTER APPLY(SELECT TOP 1 ...) to select the max bid row for each ask.
An OUTER APPLY is like a left join to a subselect, where that subselect can apply WHERE conditions, ORDER BY and in this case TOP 1 to get your desired bid. It also allows you to select extract multiple values (Bid and TimeLog in this case), which is an advantage over a simple MAX() function.
I adjusted the "same day" logic to use a condition that checks for TimeLog < start of the next day. The overall date condition in the outer where clause has also been adjusted to be a simple compare. The original logic would have selected the first 6 months in each year 2020 and earlier.
For performance, make sure that you have an index on algotrading.optiontracking(ConID, TimeLog).
SELECT OT.ConID, OT.TimeLog, OT.Ask, MaxBid.SellPrice, MaxBid.SellPriceTime
FROM algotrading.optiontracking as OT
OUTER APPLY (
SELECT TOP 1 OT2.Bid AS SellPrice, OT2.TimeLog AS SellPriceTime
FROM algotrading.optiontracking OT2
WHERE OT2.ConID = OT.ConID
AND OT2.TimeLog > OT.TimeLog -- Later
AND OT2.TimeLog < DATEADD(day, 1, CONVERT(DATE, OT.TimeLog)) -- Before start of next day
ORDER BY OT2.Bid DESC -- Max Bid
) MaxBid
WHERE OT.TimeLog < '2020-07-01' -- Is this what was intended?
Results:
| ConID | TimeLog | Ask | SellPrice | SellPriceTime |
|---|---|---|---|---|
| 1 | 2020-06-03 10:18:43.000 | 0.52 | 0.58 | 2020-06-03 10:23:54.000 |
| 1 | 2020-06-03 10:20:41.000 | 0.55 | 0.58 | 2020-06-03 10:23:54.000 |
| 1 | 2020-06-03 10:23:54.000 | 0.60 | 0.51 | 2020-06-03 10:26:26.000 |
| 1 | 2020-06-03 10:26:26.000 | 0.52 | 0.50 | 2020-06-03 10:28:57.000 |
| 1 | 2020-06-03 10:28:57.000 | 0.51 | null | null |
| 1 | 2020-06-04 10:25:57.000 | 0.65 | 0.79 | 2020-06-04 10:26:57.000 |
| 1 | 2020-06-04 10:26:57.000 | 0.80 | null | null |
| 2 | 2020-06-03 10:18:43.000 | 1.36 | 1.35 | 2020-06-03 10:20:41.000 |
| 2 | 2020-06-03 10:20:41.000 | 1.38 | 1.32 | 2020-06-03 10:23:54.000 |
| 2 | 2020-06-03 10:23:54.000 | 1.33 | 1.22 | 2020-06-03 10:26:26.000 |
| 2 | 2020-06-03 10:26:26.000 | 1.25 | 1.19 | 2020-06-03 10:28:57.000 |
| 2 | 2020-06-03 10:28:57.000 | 1.20 | null | null |
See this db<>fiddle.
CodePudding user response:
I don't fully understand what you're trying to do but this seems like a case for RANK() and PARTITION BY.
This is not a full answer but with
SELECT ConnId, TimeLog, LastPrice, Ask, Bid,
,Max(LastPrice) OVER (PARTITION BY ConnId) AS MaxLastPrice
,RANK() OVER (PARTITION BY ConnId ORDER BY Bid DESC) AS Rank
from ##Test
you'd get:
ConnId TimeLog LastPrice Ask Bid MaxLastPrice Rank
1 2020-06-04 10:26:57.0000000 10.48 0.80 0.79 10.48 1
1 2020-06-04 10:25:57.0000000 10.48 0.65 0.64 10.48 2
1 2020-06-03 10:23:54.0000000 10.45 0.60 0.58 10.48 3
1 2020-06-03 10:18:43.0000000 10.40 0.52 0.55 10.48 4
1 2020-06-03 10:26:26.0000000 10.42 0.52 0.51 10.48 5
1 2020-06-03 10:28:57.0000000 10.48 0.51 0.50 10.48 6
1 2020-06-03 10:20:41.0000000 10.30 0.55 0.42 10.48 7
2 2020-06-03 10:20:41.0000000 12.05 1.38 1.35 12.15 1
2 2020-06-03 10:18:43.0000000 12.10 1.36 1.33 12.15 2
2 2020-06-03 10:23:54.0000000 12.03 1.33 1.32 12.15 3
2 2020-06-03 10:26:26.0000000 12.13 1.25 1.22 12.15 4
2 2020-06-03 10:28:57.0000000 12.15 1.20 1.19 12.15 5
You can now order by whichever column you wish and pick the records with rank 1, for example.
Here's my test data:
CREATE TABLE ##test (
ConnId INT,
TimeLog DATETIME2,
LastPrice Numeric(5,2),
Ask Numeric(5,2),
Bid Numeric(5,2)
);
Insert into ##test values
(1, '2020-06-03 10:18:43 ', 10.40, 0.52, 0.55),
(1, '2020-06-03 10:20:41 ', 10.30, 0.55, 0.42),
(1, '2020-06-03 10:23:54 ', 10.45, 0.60, 0.58),
(1, '2020-06-03 10:26:26 ', 10.42, 0.52, 0.51),
(1, '2020-06-03 10:28:57 ', 10.48, 0.51, 0.50),
(1, '2020-06-04 10:25:57 ', 10.48, 0.65, 0.64),
(1, '2020-06-04 10:26:57 ', 10.48, 0.80, 0.79),
(2, '2020-06-03 10:18:43 ', 12.10, 1.36, 1.33),
(2, '2020-06-03 10:20:41 ', 12.05, 1.38, 1.35),
(2, '2020-06-03 10:23:54 ', 12.03, 1.33, 1.32),
(2, '2020-06-03 10:26:26 ', 12.13, 1.25, 1.22),
(2, '2020-06-03 10:28:57 ', 12.15, 1.20, 1.19)
CodePudding user response:
I was able to solve this with the way I knew how, and it is faster than the current "better" solution. I would much rather use @T N's answer which is easier to understand and seems way more flexible, but this is query is considerably faster (3.5 mins vs. 8 mins).
For some extra information:
- There is roughly 8 million records in this table for data from 6/1/2020 to 7/1/2020
- There were no indexes created yet to help with any of these queries.
Any guidance as to why this is a significantly faster query than @T N's answer?
SELECT Prof.[ConID], Prof.[PurchaseTime], Prof.[Ask], Prof.[MaxPriceAfterPurchase], Prof.[MaxPriceAfterPurchase]-Prof.[Ask] as MaxProfit, MIN(SellTimeMax.TimeLog) as SellAtMaxTime
FROM
(
SELECT DISTINCT OT.[ConID], OT.[TimeLog] as [PurchaseTime], OT.[Ask], MAX(MaxBid.[Bid]) as MaxPriceAfterPurchase
from (
SELECT * from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) OT
LEFT JOIN (
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) MaxBid
on OT.[ConID] = MaxBid.[ConID] and MaxBid.[CombDate] = CONVERT(date, OT.[TimeLog]) and DATEDIFF(second, OT.[TimeLog], MaxBid.[TimeLog]) > 0
GROUP BY OT.[ConID], OT.[TimeLog], OT.[Ask]
) Prof
LEFT JOIN
(
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) SellTimeMax
ON SellTimeMax.[ConID] = Prof.ConID and SellTimeMax.CombDate = CONVERT(date, Prof.[PurchaseTime]) and SellTimeMax.[Bid] = Prof.[MaxPriceAfterPurchase] and DATEDIFF(second, Prof.[PurchaseTime], SellTimeMax.[TimeLog]) > 0
