My table is like this:
| root_tstamp | userId |
|---|---|
| 2022-01-26T00:13:24.725+00:00 | d2212 |
| 2022-01-26T00:13:24.669+00:00 | ad323 |
| 2022-01-26T00:13:24.629+00:00 | adfae |
| 2022-01-26T00:13:24.573+00:00 | adfa3 |
| 2022-01-26T00:13:24.552+00:00 | adfef |
| ... | ... |
| 2021-01-26T00:12:24.725+00:00 | d2212 |
| 2021-01-26T00:15:24.669+00:00 | daddfe |
| 2021-01-26T00:14:24.629+00:00 | adfda |
| 2021-01-26T00:12:24.573+00:00 | 466eff |
| 2021-01-26T00:12:24.552+00:00 | adfafe |
I want to get the number of users in the current year and in the previous year, as shown below, using SQL.
| Date | Users | previous_year |
|---|---|---|
| 2022-01-01 | 10 | 5 |
| 2022-01-02 | 20 | 15 |
The code is written as follows.
select CAST(root_tstamp as DATE) as Date,
count(DISTINCT userid) as users,
count(Distinct case when CAST(root_tstamp as DATE) = dateadd(MONTH,-12,CAST(root_tstamp as DATE)) then userid end) as previous_year
FROM table1
But it returns 0 for previous_year values. How can I fix that?
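A likely reason for the zeros: the CASE condition compares each row's date with that same date shifted back 12 months, and a date can never equal itself minus a year, so no row ever satisfies it. The sketch below checks this using Python's built-in sqlite3 module as a stand-in database. That is an assumption made only so the example runs anywhere: the sample rows are made up, and SQLite's `date(x, '-12 months')` stands in for `dateadd(MONTH, -12, x)`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (root_tstamp TEXT, userid TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        ("2022-01-26 00:13:24", "d2212"),
        ("2022-01-26 00:13:25", "ad323"),
        ("2021-01-26 00:12:24", "d2212"),
    ],
)

# Same shape as the CASE condition in the question: compare a row's date
# with that very same date moved back 12 months.
matching_rows = conn.execute(
    "SELECT COUNT(*) FROM table1 "
    "WHERE date(root_tstamp) = date(root_tstamp, '-12 months')"
).fetchone()[0]

print(matching_rows)  # no row can satisfy the condition
```

To compare against last year's rows, the current-year rows have to be matched with other rows of the table, for example through a self-join.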
CodePudding user response:
Possible solution for SQL Server:
WITH cte AS ( SELECT 2022 [year]
UNION ALL
SELECT 2021 )
SELECT cte.[year],
COUNT(DISTINCT test.userId) current_users_amount,
COUNT(DISTINCT CASE WHEN YEAR(test.root_tstamp) < cte.[year]
THEN test.userId
END) previous_users_amount
FROM test
JOIN cte ON YEAR(test.root_tstamp) <= cte.[year]
GROUP BY cte.[year]
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=88b78aad9acd965bdbac4c85a0b81927
CodePudding user response:
This query (for MySQL) returns, for each day of the current year, the number of unique userIds whose root_tstamp falls on that day, together with the number of unique userIds for the same day last year. If there are no rows for a day in the current year, no row is returned for that day. If there are rows for a day in the current year but none for the same day last year, users_sameday_lastyear is 0 for that day, because COUNT ignores the NULLs produced by the LEFT JOIN.
SELECT cast(ty.root_tstamp as date) as Dte,
COUNT(DISTINCT ty.userId) as users_this_day,
count(distinct lysd.userid) as users_sameday_lastyear
FROM test ty
left join
test lysd
on cast(lysd.root_tstamp as date)=date_add(cast(ty.root_tstamp as date), interval -1 year)
WHERE YEAR(ty.root_tstamp) = year(current_date())
GROUP BY Dte
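The self-join above can be tried out without a MySQL server. The sketch below ports it to SQLite through Python's built-in sqlite3 module, which is an assumption for the sake of a runnable example: `date(x, '-1 year')` replaces `date_add(..., interval -1 year)`, the year is passed as a fixed parameter instead of `year(current_date())` so the output is reproducible, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (root_tstamp TEXT, userId TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("2022-01-26 00:13:24", "d2212"),
        ("2022-01-26 00:13:25", "ad323"),
        ("2022-01-26 00:13:26", "d2212"),  # same user twice on one day
        ("2022-01-27 08:00:00", "adfae"),  # no rows on 2021-01-27
        ("2021-01-26 00:12:24", "d2212"),
        ("2021-01-26 00:15:24", "daddfe"),
        ("2021-01-26 00:14:24", "adfda"),
    ],
)

# Self-join: every current-year row is paired with all rows that fall on
# the same calendar day one year earlier; DISTINCT undoes the fan-out.
QUERY = """
SELECT date(ty.root_tstamp)      AS dte,
       COUNT(DISTINCT ty.userId) AS users_this_day,
       COUNT(DISTINCT ly.userId) AS users_sameday_lastyear
FROM test AS ty
LEFT JOIN test AS ly
       ON date(ly.root_tstamp) = date(ty.root_tstamp, '-1 year')
WHERE strftime('%Y', ty.root_tstamp) = :year
GROUP BY dte
ORDER BY dte
"""

result = conn.execute(QUERY, {"year": "2022"}).fetchall()
for row in result:
    print(row)
# ('2022-01-26', 2, 3)
# ('2022-01-27', 1, 0)
```

The second row shows what happens when last year has no rows for that day: the count is 0.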
If you want an output row for every calendar day, even when there are no rows for that day in the current year and/or last year, you also need to introduce a calendar table (let's hope that is not what you need).
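If a calendar does turn out to be necessary, one way to sketch it is to generate the days with a recursive CTE and count users per day against it. The example below again uses SQLite through Python's sqlite3 module as a stand-in. The `calendar` CTE, the `:first_day` and `:last_day` parameters, and the sample rows are all hypothetical names and data chosen for illustration. In MySQL 8 or SQL Server the recursive CTE and date arithmetic would need that dialect's syntax.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (root_tstamp TEXT, userId TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("2022-01-01 09:00:00", "d2212"),
        ("2022-01-03 10:00:00", "ad323"),
        ("2021-01-02 11:00:00", "adfae"),
        ("2021-01-03 12:00:00", "d2212"),
        ("2021-01-03 13:00:00", "adfa3"),
    ],
)

# The recursive CTE emits one row per day from :first_day to :last_day,
# so days without any activity still appear in the output.
CALENDAR_QUERY = """
WITH RECURSIVE calendar(dte) AS (
    SELECT date(:first_day)
    UNION ALL
    SELECT date(dte, '+1 day') FROM calendar WHERE dte < date(:last_day)
)
SELECT c.dte,
       (SELECT COUNT(DISTINCT t.userId) FROM test AS t
         WHERE date(t.root_tstamp) = c.dte)                   AS users,
       (SELECT COUNT(DISTINCT t.userId) FROM test AS t
         WHERE date(t.root_tstamp) = date(c.dte, '-1 year'))  AS previous_year
FROM calendar AS c
ORDER BY c.dte
"""

calendar_rows = conn.execute(
    CALENDAR_QUERY, {"first_day": "2022-01-01", "last_day": "2022-01-03"}
).fetchall()
for row in calendar_rows:
    print(row)
# ('2022-01-01', 1, 0)
# ('2022-01-02', 0, 1)
# ('2022-01-03', 1, 2)
```

The correlated subqueries keep the sketch short. On a large table, pre-aggregating per day and joining the two aggregates to the calendar would probably perform better.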
