I have a table called 'users' that has the following structure:
| id (PK) | campaign_id | createdAt |
|---|---|---|
| 1 | 123 | 2022-07-14T10:30:01.967Z |
| 2 | 1234 | 2022-07-14T10:30:01.967Z |
| 3 | 123 | 2022-07-14T10:30:01.967Z |
| 4 | 123 | 2022-07-14T10:30:01.967Z |
At the same time I have a table that tracks clicks per user:
| id (PK) | user_id(FK) | createdAt |
|---|---|---|
| 1 | 1 | 2022-07-14T10:30:01.967Z |
| 2 | 2 | 2022-07-14T10:30:01.967Z |
| 3 | 2 | 2022-07-14T10:30:01.967Z |
| 4 | 2 | 2022-07-14T10:30:01.967Z |
Both of these tables hold up to millions of records, so I need the most efficient query to group the data per campaign_id.
The result I am looking for would look like this:
| campaign_id | total_users | total_clicks |
|---|---|---|
| 123 | 3 | 1 |
| 1234 | 1 | 3 |
Unfortunately, I have no idea how to achieve this with good performance. Most importantly, I need to use WHERE or HAVING to limit the query to a certain time range by createdAt.
CodePudding user response:
Note: neither PostgreSQL nor SQL in general is my forte, but I learned something by spending some time on your question. Have a go with an INNER JOIN between two separate SELECT subqueries:
SELECT *
FROM (
    SELECT campaign_id, COUNT(t1."id(PK)") AS total_users
    FROM t1
    GROUP BY campaign_id
) tbl1
INNER JOIN (
    SELECT campaign_id, COUNT(t2."user_id(FK)") AS total_clicks
    FROM t2
    INNER JOIN t1 ON t1."id(PK)" = t2."user_id(FK)"
    GROUP BY campaign_id
) tbl2
USING (campaign_id)
See an online fiddle (there, t1 and t2 correspond to the users and clicks tables from the question). I believe this is also ready for a WHERE clause in each of the two inner SELECT statements to filter by "createdAt". I'm pretty sure someone else will come up with something better.
Good luck.
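To make the "WHERE clause in both SELECT statements" idea concrete, here is a minimal sketch that runs the two-subquery shape against the sample rows from the question. It makes several assumptions: an in-memory SQLite database (via Python's `sqlite3`) stands in for PostgreSQL, the tables are named `users` and `clicks`, timestamps are stored as ISO-8601 text in one consistent format, and `campaign_stats`, `:start_ts` and `:end_ts` are hypothetical names chosen for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL for this demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, campaign_id INTEGER, createdAt TEXT);
CREATE TABLE clicks (id INTEGER PRIMARY KEY,
                     user_id INTEGER REFERENCES users(id),
                     createdAt TEXT);
INSERT INTO users VALUES
  (1, 123,  '2022-07-14T10:30:01.967Z'),
  (2, 1234, '2022-07-14T10:30:01.967Z'),
  (3, 123,  '2022-07-14T10:30:01.967Z'),
  (4, 123,  '2022-07-14T10:30:01.967Z');
INSERT INTO clicks VALUES
  (1, 1, '2022-07-14T10:30:01.967Z'),
  (2, 2, '2022-07-14T10:30:01.967Z'),
  (3, 2, '2022-07-14T10:30:01.967Z'),
  (4, 2, '2022-07-14T10:30:01.967Z');
""")

# Each inner SELECT carries its own half-open time-range filter.
QUERY = """
SELECT campaign_id, total_users, total_clicks
FROM (
    SELECT campaign_id, COUNT(id) AS total_users
    FROM users
    WHERE createdAt >= :start_ts AND createdAt < :end_ts
    GROUP BY campaign_id
) AS tbl1
INNER JOIN (
    SELECT u.campaign_id AS campaign_id, COUNT(c.user_id) AS total_clicks
    FROM clicks AS c
    INNER JOIN users AS u ON u.id = c.user_id
    WHERE c.createdAt >= :start_ts AND c.createdAt < :end_ts
    GROUP BY u.campaign_id
) AS tbl2 USING (campaign_id)
ORDER BY campaign_id
"""


def campaign_stats(start_ts, end_ts):
    """Return (campaign_id, total_users, total_clicks) rows for [start_ts, end_ts)."""
    params = {"start_ts": start_ts, "end_ts": end_ts}
    return conn.execute(QUERY, params).fetchall()


print(campaign_stats("2022-07-14T00:00:00.000Z", "2022-07-15T00:00:00.000Z"))
```

For the sample data this should give the two rows the question asks for. Note that the INNER JOIN between the two subqueries drops any campaign that has users but no clicks in the range; whether that is wanted depends on the report.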
CodePudding user response:
Hope this will help you.
select u.campaign_id,
       count(distinct u.id) as users_count,
       count(c.user_id) as clicks_count
from users u
left join clicks c on u.id = c.user_id
group by 1;
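The query above does not yet include the createdAt time range the question asks for. Below is a rough sketch of one way to add it while keeping the LEFT JOIN: the filter on clicks goes into the ON clause so that users without clicks in the range still count, and the filter on users goes into WHERE. As assumptions, an in-memory SQLite database (via Python's `sqlite3`) stands in for PostgreSQL, timestamps are ISO-8601 text, `campaign_stats_left_join` and the `:start_ts` / `:end_ts` parameters are hypothetical names, and user 5 with its August click is an extra row added only to show the zero-click case.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL for this demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, campaign_id INTEGER, createdAt TEXT);
CREATE TABLE clicks (id INTEGER PRIMARY KEY,
                     user_id INTEGER REFERENCES users(id),
                     createdAt TEXT);
INSERT INTO users VALUES
  (1, 123,  '2022-07-14T10:30:01.967Z'),
  (2, 1234, '2022-07-14T10:30:01.967Z'),
  (3, 123,  '2022-07-14T10:30:01.967Z'),
  (4, 123,  '2022-07-14T10:30:01.967Z'),
  (5, 999,  '2022-07-14T10:30:01.967Z');  -- hypothetical extra user
INSERT INTO clicks VALUES
  (1, 1, '2022-07-14T10:30:01.967Z'),
  (2, 2, '2022-07-14T10:30:01.967Z'),
  (3, 2, '2022-07-14T10:30:01.967Z'),
  (4, 2, '2022-07-14T10:30:01.967Z'),
  (5, 5, '2022-08-01T09:00:00.000Z');     -- hypothetical click outside the range
""")

# Click filter lives in ON: out-of-range clicks become NULL rows instead of
# removing the user. User filter lives in WHERE: out-of-range users are dropped.
QUERY = """
SELECT u.campaign_id,
       COUNT(DISTINCT u.id) AS total_users,
       COUNT(c.id)          AS total_clicks
FROM users AS u
LEFT JOIN clicks AS c
       ON c.user_id = u.id
      AND c.createdAt >= :start_ts AND c.createdAt < :end_ts
WHERE u.createdAt >= :start_ts AND u.createdAt < :end_ts
GROUP BY u.campaign_id
ORDER BY u.campaign_id
"""


def campaign_stats_left_join(start_ts, end_ts):
    """Return (campaign_id, total_users, total_clicks) rows for [start_ts, end_ts)."""
    params = {"start_ts": start_ts, "end_ts": end_ts}
    return conn.execute(QUERY, params).fetchall()


for row in campaign_stats_left_join("2022-07-14T00:00:00.000Z",
                                    "2022-07-15T00:00:00.000Z"):
    print(row)
```

With this placement, campaign 999 is still reported with one user and zero clicks; moving the click filter into WHERE would remove that campaign from the result entirely. On PostgreSQL with millions of rows, indexes on clicks(user_id) and on the createdAt columns would probably matter for performance, but that is not measured here.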
