I have given an home assignment to explain the following query:
select
date,
country,
count(*) as `result`
from (
select
user_id,
event_date as date,
any_value(geo.country) as country
from `xxxx-android.events.all`
where event_date between current_date - interval 2 day and current_date - interval 1 day
group by user_id, date
)
group by date, country
order by date, count(*) desc
And I tried to replicate it in MySql with this online editor using the following code :
CREATE TABLE events (
user_id INTEGER,
event_date DATE,
country TEXT -- NOT NULL
);
-- insert some values
INSERT INTO events VALUES (1, '2022-1-9', 'USA');
INSERT INTO events VALUES (1, '2022-1-9', 'USA');
INSERT INTO events VALUES (1, '2022-1-10', 'USA');
INSERT INTO events VALUES (2, '2022-1-9', 'UK');
INSERT INTO events VALUES (2, '2022-1-10', 'UK');
INSERT INTO events VALUES (2, '2022-1-9', 'UK');
INSERT INTO events VALUES (3, '2022-1-9', 'USA');
INSERT INTO events VALUES (3, '2022-1-10', 'USA');
INSERT INTO events VALUES (3, '2022-1-10', 'USA');
INSERT INTO events VALUES (4, '2022-1-9', 'AUT');
INSERT INTO events VALUES (4, '2022-1-10', 'AUT');
INSERT INTO events VALUES (5, '2022-1-10', 'AUT');
INSERT INTO events VALUES (5, '2022-1-10', 'AUT');
INSERT INTO events VALUES (6, '2022-1-9', NULL);
INSERT INTO events VALUES (6, '2022-1-9', NULL);
-- fetch some values
-- SELECT * FROM events;
-- select curdate() - interval 2 day;
-- select curdate() - interval 1 day;
select
event_date,
country,
count(*) as 'result'
from (
-- begin internal query
select
user_id as user_id,
event_date as event_date,
any_value(country) as country
from events
where event_date between curdate() - interval 2 day and curdate() - interval 1 day
group by user_id, event_date
-- end begin internal query
) as whatever
group by event_date, country
order by event_date, count(*) desc
And I get the following table :
Output:
event_date country result
2022-01-09 USA 2
2022-01-09 UK 1
2022-01-09 AUT 1
2022-01-09 NULL 1
2022-01-10 USA 2
2022-01-10 AUT 2
2022-01-10 UK 1
Shouldn't the result count if a user enters multiple times a day? For example the user with id 1 entered 2 times in '2022-01-09' but in the output was only counted as one.
CodePudding user response:
that is implied by the term in the Group BYin the inner SELECT
you group by user id and events
group by user_id, event_date
But
user_id as user_id,
event_date as event_date,
any_value(country) as country
wouldn't show any duplicate Values see manual to any_value
select user_id as user_id, event_date as event_date, any_value(country) as country from events where event_date between curdate() - interval 2 day and curdate() - interval 1 day group by user_id, event_dateuser_id | event_date | country ------: | :--------- | :------ 1 | 2022-01-09 | USA 1 | 2022-01-10 | USA 2 | 2022-01-09 | UK 2 | 2022-01-10 | UK 3 | 2022-01-09 | USA 3 | 2022-01-10 | USA 4 | 2022-01-09 | AUT 4 | 2022-01-10 | AUT 5 | 2022-01-10 | AUT 6 | 2022-01-09 | null
db<>fiddle here
that is why the count is only 2 and not 3 for the 09.09.2022
