I have a query and I would like to get only those rows who satisfies the first OR condition. I want to check conditions such as if the first OR will not have satisfy condition, then I should see the second OR and etc.
This is a simple example. However, these are really big tables with 30 millions of row. I cannot use full join. And my code should be put into function.
My sample data:
DECLARE @tbl_1 TABLE (Id INT)
DECLARE @tbl_2 TABLE (Id INT)
DECLARE @tbl_3 TABLE (Id INT)
DECLARE @tbl_4 TABLE (Id INT)
and its data:
INSERT INTO @tbl_1 ([Id]) VALUES
(1), (2), (3), (4), (5), (6)
INSERT INTO @tbl_2 ([Id]) VALUES (8)
INSERT INTO @tbl_3 ([Id]) VALUES (3)
INSERT INTO @tbl_4 ([Id]) VALUES (4)
SELECT * FROM @tbl_1 AS t
WHERE t.Id IN (SELECT Id FROM @tbl_2 AS t2)
OR t.Id IN (SELECT Id FROM @tbl_3 AS t2)
OR t.Id IN (SELECT Id FROM @tbl_4 AS t2)
Then I would like to see just one item:
3
But it gives: 3, 4
Or another example:
INSERT INTO @tbl_1 ([Id]) VALUES
(1), (2), (3), (4), (5), (6)
INSERT INTO @tbl_2 ([Id]) VALUES (1), (2)
INSERT INTO @tbl_3 ([Id]) VALUES (3)
INSERT INTO @tbl_4 ([Id]) VALUES (4)
SELECT * FROM @tbl_1 AS t
WHERE t.Id IN (SELECT Id FROM @tbl_2 AS t2)
OR t.Id IN (SELECT Id FROM @tbl_3 AS t2)
OR t.Id IN (SELECT Id FROM @tbl_4 AS t2)
Then I would like to see just two first items:
1, 2
But it gives: 1, 2, 3, 4
I've tried to use OR. However, it takes all items which satisfy condition.
How is it possible to write WHERE statement where just one first OR statement will be executed?
CodePudding user response:
You want a ranking. Rows from @tbl_2 have precedence over rows from table @tbl_3 and these again have precedence over rows from table @tbl_4.
Ideally you would just have one table instead of three and store the priority along:
INSERT INTO @ids (id, priority) VALUES (1, 1), (2, 1), (3, 2), (4, 3);
As we are looking for the best overall priority (in contrast to, say, looking for the best match per product or the like), we can join, order our rows by priority and use TOP WITH TIES to only keep the rows with the best priority.
SELECT TOP(1) WITH TIES t.*
FROM @tbl_1 AS t
JOIN @ids ids ON ids.id = t.id
ORDER BY ids.priority;
If you want to stick with the three tabes, then add the priority on-the-fly:
WITH ids AS
(
SELECT id, 1 AS priority FROM @tbl_2
UNION ALL
SELECT id, 2 AS priority FROM @tbl_3
UNION ALL
SELECT id, 3 AS priority FROM @tbl_4
)
SELECT TOP(1) WITH TIES t.*
FROM @tbl_1 AS t
JOIN ids ON ids.id = t.id
ORDER BY ids.priority;
CodePudding user response:
Try this:
SELECT * FROM tbl_1 AS t
WHERE
EXISTS(SELECT 1 FROM tbl_2 t2 WHERE t2.id = t.id)
AND
(SELECT COUNT(1) FROM tbl_1 t1
JOIN tbl_2 t2
ON t2.id = t1.id) > 0
OR (
EXISTS(SELECT 1 FROM tbl_3 t3 WHERE t3.id = t.id)
AND
(SELECT COUNT(1) FROM tbl_1 t1
JOIN tbl_2 t2
ON t2.id = t1.id) = 0
AND
(SELECT COUNT(1) FROM tbl_1 t1
JOIN tbl_3 t3
ON t3.id = t1.id) > 0
)
OR (
EXISTS(SELECT 1 FROM tbl_4 t4 WHERE t4.id = t.id)
AND
(SELECT COUNT(1) FROM tbl_1 t1
JOIN tbl_2 t2
ON t2.id = t1.id) = 0
AND
(SELECT COUNT(1) FROM tbl_1 t1
JOIN tbl_3 t3
ON t3.id = t1.id) = 0
AND
(SELECT COUNT(1) FROM tbl_1 t1
JOIN tbl_4 t4
ON t4.id = t1.id) > 0
)
In this way you have the OR exclusive
Sql Fiddle #1 (with dataset 1, 2; 3; 4)
Sql Fiddle #2 (with dataset 8; 3; 4)
CodePudding user response:
A TOP (1) WITH TIES with an ORDER BY over a CASE WHEN for the 3 table id's would do the job.
If nothing matched with @tbl_2 then the matches of @tbl_3 will come on top. And if no match with @tbl_2 & @tbl_3 then @tbl_4 will come on top.
DECLARE @tbl_1 TABLE (Id INT) DECLARE @tbl_2 TABLE (Id INT) DECLARE @tbl_3 TABLE (Id INT) DECLARE @tbl_4 TABLE (Id INT) INSERT INTO @tbl_1 ([Id]) VALUES (1), (2), (3), (4), (5), (6) INSERT INTO @tbl_2 ([Id]) VALUES (8) INSERT INTO @tbl_3 ([Id]) VALUES (3) INSERT INTO @tbl_4 ([Id]) VALUES (4) SELECT TOP (1) WITH TIES t.* FROM @tbl_1 AS t LEFT JOIN @tbl_2 t2 ON t2.Id = t.Id LEFT JOIN @tbl_3 t3 ON t3.Id = t.Id LEFT JOIN @tbl_4 t4 ON t4.Id = t.Id WHERE t.Id IN (t2.Id, t3.Id, t4.Id) ORDER BY CASE WHEN t2.Id IS NOT NULL THEN 2 WHEN t3.Id IS NOT NULL THEN 3 WHEN t4.Id IS NOT NULL THEN 4 END;
Id 3
DECLARE @tbl_1 TABLE (Id INT) DECLARE @tbl_2 TABLE (Id INT) DECLARE @tbl_3 TABLE (Id INT) DECLARE @tbl_4 TABLE (Id INT) INSERT INTO @tbl_1 ([Id]) VALUES (1), (2), (3), (4), (5), (6) INSERT INTO @tbl_2 ([Id]) VALUES (1), (2) INSERT INTO @tbl_3 ([Id]) VALUES (3) INSERT INTO @tbl_4 ([Id]) VALUES (4) SELECT TOP (1) WITH TIES t.* FROM @tbl_1 AS t LEFT JOIN @tbl_2 t2 ON t2.Id = t.Id LEFT JOIN @tbl_3 t3 ON t3.Id = t.Id LEFT JOIN @tbl_4 t4 ON t4.Id = t.Id WHERE t.Id IN (t2.Id, t3.Id, t4.Id) ORDER BY CASE WHEN t2.Id IS NOT NULL THEN 2 WHEN t3.Id IS NOT NULL THEN 3 WHEN t4.Id IS NOT NULL THEN 4 END;
Id 1 2
Test on db<>fiddle here
CodePudding user response:
Assuming t2, t3 & t4 have unique values. You would probably would like to verify that the execution plan is based on merge joins / hash joins and not nested loops.
select top 1 with ties
t1.*
from @tbl_1 AS t1
left join @tbl_2 as t2 on t2.id = t1.id
left join @tbl_3 as t3 on t3.id = t1.id
left join @tbl_4 as t4 on t4.id = t1.id
where coalesce(t2.id,t3.id,t4.id) is not null
order by case
when t2.Id is not null then 1
when t3.Id is not null then 2
when t4.Id is not null then 3
end
Another solution with no JOINs at all. Only GROUP BY :-)
select top 1 with ties
id
from ( select id, 999 from @tbl_1
union all select id, 1 from @tbl_2
union all select id, 2 from @tbl_3
union all select id, 3 from @tbl_4
) t (id,priority)
group by id
having max(priority) = 999
order by min(priority)
CodePudding user response:
One other idea you might entertain, although not exactly elegant, is a more procedural approach so you only end up touching a table if you need to which will be more performant if that's a concern.
Insert qualifying IDs into a temp table and only move on if there are no results at each stage, for example
create table #results (id int);
insert into #results
select id
from t1 where exists (select * from t2 where t2.id=t1.id)
if @@RowCount=0
begin
insert into #results
select id
from t1 where exists (select * from t3 where t3.Id=t1.Id)
if @@RowCount=0
begin
insert into #results
select id
from t1 where exists (select * from t4 where t4.Id=t1.Id)
end
end
select id
from #results
