Home > Software engineering >  Converting Nested Subqueries into Mini Queries
Converting Nested Subqueries into Mini Queries

Time:02-08

I have a lot of trouble reading nested subqueries - I personally prefer to write several mini queries and work from there. I understand that more advanced SQL users find it more efficient to write nested subqueries.

For instance, in the following query:

select distinct b.table_b, a.*
from table_a a
inner join table_c b
on a.id_1 = b.id_1
inner join ( select a.id_1, max(a.var_1) as max_var_1 from table_a a
group by a.id_1) c
on a.id_1 = b.id_1 and a.var_1 = c.max_var_1

Problem: I am trying to turn this into several different queries:

#PART 1 :

create table_1 as select distinct b.table_b, a.*
from table_a a
inner join table_c b
on a.id_1 = b.id_1

#PART 2:

 create table_2 as select a.id_1, max(a.var_1) as max_var_1 from table_a a
group by a.id_1

#PART 3 (final result: final_table)

create final_table as select a.*, b.*
from table_1 a
inner join table_2 b
on a.id_1 = b.id_1 and a.var_1 = b.max_var_1

My Question: Can someone please tell me if this is correct? Is this how the above nested subquery can be converted into 3 mini queries?

Thanks!

CodePudding user response:

Subqueries are only inserted into separate tables when you use them multiple times. And yet, if the result of the subquery returns many records, then it is not recommended to insert them separately into the table. Because when you are using only select DB will only read data from the disc, but when using insert command, DB will write to disc. Inserting many records may be long process than selecting.

P.S. Mostly used "create temporary table" when inserting subquery process.

Another good way is to use "CTE (Common Table Expression)". When using "CTE", the database stores the results of "SELECT" queries in RAM, executing the subquery only once. If then subqueries are then used multiple times, the database only uses the results from RAM (not executing).

CodePudding user response:

For the performance of your query, you can use only #PART2, and others should not be used. They are unnecessary. But for better performance, I recommended you write your query without inserting, using CTE. For example:

with sub_query as (
    select 
        id_1, 
        max(var_1) as max_var_1 
    from 
        table_a 
    group by id_1
)
select distinct b.table_b, a.*
from table_a a
inner join table_c b
on a.id_1 = b.id_1
inner join sub_query c
on a.id_1 = b.id_1 and a.var_1 = c.max_var_1;
  •  Tags:  
  • Related