I need some useful tutorial links to understand how to create a query which will bring in only incremental data on a weekly basis. This data will be pulled in incrementally using date & time column.
CodePudding user response:
lets say you have a table customer
create table customer(id int, name varchar(10), age int, created_on date, last_modified_on datetime)
create table customer_inc(id int, name varchar(10), age int, created_on date, last_modified_on datetime)
insert into customer(id,name,age,created_on,last_modified_on )
values(1,'a',54,'2022-01-01','2022-01-01 00:00:00.000')
values(2,'a',52,'2022-01-01','2022-01-01 00:00:00.000')
values(3,'a',53,'2022-01-01','2022-01-01 00:00:00.000')
| id | name | age | created_on | last_modified_on |
|---|---|---|---|---|
| 1 | a | 54 | 2022-01-01 | 2022-01-01 00:00:00.000 |
| 2 | b | 52 | 2022-01-01 | 2022-01-01 00:00:00.000 |
| 3 | c | 53 | 2022-01-01 | 2022-01-01 00:00:00.000 |
you pull all these record as full load into your inc table
Now the record id 3 gets updated and new record gets inserted
| id | name | age | created_on | last_modified_on |
|---|---|---|---|---|
| 1 | a | 54 | 2022-01-01 | 2022-01-01 00:00:00.000 |
| 2 | b | 52 | 2022-01-01 | 2022-01-01 00:00:00.000 |
| 3 | f | 53 | 2022-01-01 | 2022-01-27 00:00:00.000 |
| 2 | y | 30 | 2022-01-27 | 2022-01-27 00:00:00.000 |
Now you need to identify these new data using last_modifed_on by getting the maximum value of last_modified_on from customer_inc
Solution 1:
use merge logic for the below dataset
select * from customer where last_modified_on > (select max(last_modified_on) from customer_inc)
Solution 2: Always load the previous day(D-1) data as per your requirement.
use merge logic for the below dataset
select * from customer where last_modified_on > cast(getdate()-1 as date)
