Home > Enterprise >  SQL query interpretation
SQL query interpretation

Time:01-12

I have given an home assignment to explain the following query:

select
    date,
    country,
    count(*) as `result`
from (
    select
        user_id,
        event_date as date,
        any_value(geo.country) as country
    from `xxxx-android.events.all`
    where event_date between current_date - interval 2 day and current_date - interval 1 day
    group by user_id, date
)
group by date, country
order by date, count(*) desc

And I tried to replicate it in MySql with this online editor using the following code :

CREATE TABLE events (
  user_id INTEGER,
  event_date DATE,
  country TEXT -- NOT NULL
);
-- insert some values
INSERT INTO events VALUES (1, '2022-1-9', 'USA');
INSERT INTO events VALUES (1, '2022-1-9', 'USA');
INSERT INTO events VALUES (1, '2022-1-10', 'USA');
INSERT INTO events VALUES (2, '2022-1-9', 'UK');
INSERT INTO events VALUES (2, '2022-1-10', 'UK');
INSERT INTO events VALUES (2, '2022-1-9', 'UK');
INSERT INTO events VALUES (3, '2022-1-9', 'USA');
INSERT INTO events VALUES (3, '2022-1-10', 'USA');
INSERT INTO events VALUES (3, '2022-1-10', 'USA');
INSERT INTO events VALUES (4, '2022-1-9', 'AUT');
INSERT INTO events VALUES (4, '2022-1-10', 'AUT');
INSERT INTO events VALUES (5, '2022-1-10', 'AUT');
INSERT INTO events VALUES (5, '2022-1-10', 'AUT');
INSERT INTO events VALUES (6, '2022-1-9', NULL);
INSERT INTO events VALUES (6, '2022-1-9', NULL);

-- fetch some values
-- SELECT * FROM events;

-- select curdate() - interval 2 day;
-- select curdate() - interval 1 day;


select
    event_date,
    country,
    count(*) as 'result'
from (

-- begin internal query
select
    user_id as user_id,
    event_date as event_date,
    any_value(country) as country
from events
    where event_date between curdate() - interval 2 day and curdate() - interval 1 day
group by user_id, event_date
-- end begin internal query
 
) as whatever
group by event_date, country
order by event_date, count(*) desc

And I get the following table :

Output:

event_date  country result
2022-01-09  USA       2
2022-01-09  UK        1
2022-01-09  AUT       1
2022-01-09  NULL      1
2022-01-10  USA       2
2022-01-10  AUT       2
2022-01-10  UK        1

Shouldn't the result count if a user enters multiple times a day? For example the user with id 1 entered 2 times in '2022-01-09' but in the output was only counted as one.

CodePudding user response:

that is implied by the term in the Group BYin the inner SELECT

you group by user id and events

group by user_id, event_date

But

user_id as user_id,
event_date as event_date,
any_value(country) as country

wouldn't show any duplicate Values see manual to any_value

select
    user_id as user_id,
    event_date as event_date,
    any_value(country) as country
from events
    where event_date between curdate() - interval 2 day and curdate() - interval 1 day
group by user_id, event_date
user_id | event_date | country
------: | :--------- | :------
      1 | 2022-01-09 | USA    
      1 | 2022-01-10 | USA    
      2 | 2022-01-09 | UK     
      2 | 2022-01-10 | UK     
      3 | 2022-01-09 | USA    
      3 | 2022-01-10 | USA    
      4 | 2022-01-09 | AUT    
      4 | 2022-01-10 | AUT    
      5 | 2022-01-10 | AUT    
      6 | 2022-01-09 | null   

db<>fiddle here

that is why the count is only 2 and not 3 for the 09.09.2022

  •  Tags:  
  • Related