How to do this in Mysql to get all users even no records or absent on that selected date range? attendance_tbl
----------------------------------------------------------------------------------------
| ID | user_id | time_in | time_out | created_at |
----------------------------------------------------------------------------------------
| 1 | 001 | 2022-01-01 08:00:00 | 2022-01-01 17:00:00 | 2022-01-03 08:00:00 |
| 2 | 002 | 2022-01-01 08:15:24 | 2022-01-01 17:00:00 | 2022-01-03 08:15:24 |
| 3 | 003 | 2022-01-02 08:44:55 | 2022-01-02 17:00:00 | 2022-01-04 08:44:55 |
| 4 | 004 | 2022-01-03 08:40:22 | 2022-01-03 17:00:00 | 2022-01-04 08:40:22 |
----------------------------------------------------------------------------------------
users_tbl
----------------------------------
| ID | user_id | f_name |
----------------------------------
| 1 | 001 | John Doe |
| 2 | 002 | Jane Doe |
| 3 | 003 | Ronal Black |
| 4 | 004 | Lucy White |
----------------------------------
Expected Output Daterange : from 2022-01-01 to 2022-01-03 Will get all the Users Fullname
----------------------------------------------------------------------------------------------------------------------------
| ID | user_id | Date | f_name | time_in | time_out | created_at |
----------------------------------------------------------------------------------------------------------------------------
| 1 | 001 | Jan 1 2022 | John Doe | 2022-01-01 08:00:00 | 2022-01-01 17:00:00 | 2022-01-03 08:00:00 |
| 2 | 002 | Jan 1 2022 | Jane Doe | 2022-01-01 08:15:24 | 2022-01-01 08:15:24 | 2022-01-03 08:00:00 |
| 3 | 003 | Jan 1 2022 | Ronal Black | | | |
| 4 | 004 | Jan 1 2022 | Lucy White | | | |
| 5 | 001 | Jan 2 2022 | John Doe | | | |
| 6 | 002 | Jan 2 2022 | Jane Doe | | | |
| 7 | 003 | Jan 2 2022 | Ronal Black | 2022-01-02 17:00:00 | 2022-01-02 17:00:00 | 2022-01-02 17:00:00 |
| 8 | 004 | Jan 2 2022 | Lucy White | | | |
| 9 | 001 | Jan 3 2022 | John Doe | | | |
| 10 | 002 | Jan 3 2022 | Jane Doe | | | |
| 11 | 003 | Jan 3 2022 | Ronal Black | | | |
| 12 | 004 | Jan 3 2022 | Lucy White | 2022-01-04 17:00:00 | 2022-01-04 17:00:00 | 2022-01-04 17:00:00 |
----------------------------------------------------------------------------------------------------------------------------
CodePudding user response:
Check this. In here I call the attendance_tbl twice, one for creating a list of date and users and the other for fetching the data (time in and time out). And by using BETWEEN as @nnichols suggested to filter the selected range you prefer which I just realized earlier.
select u.`user_id`, date(a.time_in) as `date`, u.`f_name`, b.`time_in`, b.`time_out`, b.created_at from attendance_tbl a
join users_tbl u
left join attendance_tbl b on b.`user_id`=u.`user_id` and date(b.`time_in`)=date(a.`time_in`)
WHERE DATE(a.time_in) BETWEEN '2022-01-01' AND '2022-01-31'
GROUP BY `date`, u.user_id;
RESULT
user_id date f_name time_in time_out created_at
------- ---------- ----------- ------------------- ------------------- ---------------------
001 2022-01-01 John Doe 2022-01-01 08:00:00 2022-01-01 17:00:00 2022-01-03 08:00:00
002 2022-01-01 Jane Doe 2022-01-01 08:15:24 2022-01-01 17:00:00 2022-01-03 08:15:24
003 2022-01-01 Ronal Black (NULL) (NULL) (NULL)
004 2022-01-01 Lucy White (NULL) (NULL) (NULL)
001 2022-01-02 John Doe (NULL) (NULL) (NULL)
002 2022-01-02 Jane Doe (NULL) (NULL) (NULL)
003 2022-01-02 Ronal Black 2022-01-02 08:44:55 2022-01-02 17:00:00 2022-01-04 08:44:55
004 2022-01-02 Lucy White (NULL) (NULL) (NULL)
001 2022-01-03 John Doe (NULL) (NULL) (NULL)
002 2022-01-03 Jane Doe (NULL) (NULL) (NULL)
003 2022-01-03 Ronal Black (NULL) (NULL) (NULL)
004 2022-01-03 Lucy White 2022-01-03 08:40:22 2022-01-03 17:00:00 2022-01-04 08:40:22
For the ID column just create a table with AUTO_INCREMENT id and insert your selected data.
To format your date (if you really really need to) like the one in your example result, just change the DATE(a.time_in) to DATE_format(a.time_in, '%b %d %Y').
CodePudding user response:
Given that you want to include the absent data we start by using a cross join to build the full set of dates and users. You do not need to use a derived table but I have done it that way for clarity (the performance over head is negligible) -
SELECT *
FROM (
SELECT DISTINCT DATE(`time_in`) `date`
FROM `attendance_tbl`
WHERE `time_in` BETWEEN '2022-01-01 00:00:00' AND '2022-01-30 23:59:59'
) d
INNER JOIN users_tbl u
By joining between these two tables we get the cartesian product (all combinations of the two sets). We then just take it a step further by using a left join to the attendance data -
SELECT
u.user_id,
DATE_FORMAT(d.date, '%b %d %Y') `date`,
u.f_name,
MIN(a.time_in) AS `time_in`,
MAX(a.time_out) AS `time_out`
FROM (
SELECT DISTINCT
DATE(`time_in`) `date`,
TIMESTAMP(DATE(`time_in`), '00:00:00') `begin`,
TIMESTAMP(DATE(`time_in`), '23:59:59') `end`
FROM `attendance_tbl`
WHERE `time_in` BETWEEN '2022-01-01 00:00:00' AND '2022-01-30 23:59:59'
) d
INNER JOIN `users_tbl` u
LEFT JOIN `attendance_tbl` a
ON `u`.`user_id` = `a`.`user_id`
AND `a`.`time_in` BETWEEN `d`.`begin` AND `d`.`end`
GROUP BY d.date, u.user_id
ORDER BY d.date, u.user_id;
If your attendance_tbl can only have 0 or 1 row per user per day then you can drop the GROUP BY and aggregate_functions in the select list.
You will notice that I have added begin and end to the derived table. This is to allow for index use for the join. This is not important while the attendance_tbl is small but will matter more as the table grows. Adding an index on (user_id, time_in) will make a huge difference to performance in the longer term.
N.B. This does not handle holes in the date range. Please reply to my previous comments by adding the details to your question if this is a requirement and update your examples to cover the scenario.
