The Data
ROW YEAR PROD KEY DATE
1 2011 APPLE TIME 2011-11-18 00:00:00.000
2 2011 APPLE TIME 2011-11-19 00:00:00.000
3 2013 APPLE NULL 2011-11-18 00:00:00.000
4 2013 APPLE NULL 2011-11-19 00:00:00.000
5 2013 APPLE TIME 2014-04-08 00:00:00.000
6 2013 APPLE DIM 2014-04-09 00:00:00.000
7 2013 APPLE TIME 2014-11-10 10:50:14.113
8 2013 APPLE TIME 2014-11-12 10:46:04.947
9 2013 MELON JAK 2011-10-17 11:01:19.657
10 2013 MELON TIME 2014-11-18 11:19:35.547
11 2013 MELON NULL 2014-11-19 11:19:35.547
12 2013 MELON TIME 2014-11-21 10:32:36.017
13 2014 APPLE JAK 2003-04-10 00:00:00.000
14 2014 APPLE DIM 2003-04-11 00:00:00.000
15 2015 APPLE TIME 2002-09-27 00:00:00.000
16 2015 APPLE NULL 2004-09-28 00:00:00.000
ROW is not a column in the table. Is just to show which records i want.
The question
The above data is partitionned by (YEAR, PROD) and ordered by DATE.
I need to keep all the rows except row 3 and 4 based on the following logic :
- if the first rows of a group (here (YEAR, PROD)) are NULL, discard them.
- 11 and 16 are null but we keep them because they are not first of their group.
Each group has to start with records that have a KEY that is are not null
==> otherwise discard
In other words, i can have : not null, null, not null, null
But i cannot have : null, not null, null, not null
Expected result
ROW YEAR PROD KEY DATE
1 2011 APPLE TIME 2011-11-18 00:00:00.000
2 2011 APPLE TIME 2011-11-19 00:00:00.000
5 2013 APPLE TIME 2014-04-08 00:00:00.000
6 2013 APPLE DIM 2014-04-09 00:00:00.000
7 2013 APPLE TIME 2014-11-10 10:50:14.113
8 2013 APPLE TIME 2014-11-12 10:46:04.947
9 2013 MELON JAK 2011-10-17 11:01:19.657
10 2013 MELON TIME 2014-11-18 11:19:35.547
11 2013 MELON TIME 2014-11-19 11:19:35.547
12 2013 MELON TIME 2014-11-21 10:32:36.017
13 2014 APPLE JAK 2003-04-10 00:00:00.000
14 2014 APPLE DIM 2003-04-11 00:00:00.000
15 2015 APPLE TIME 2002-09-27 00:00:00.000
16 2015 APPLE TIME 2004-09-28 00:00:00.000
I want to do that, so later i have always a non null key at the begginning of each group. In that way, i can later always use the former row to fill a subsequent records which have null value (in this example 11 and 16)
Any observation or suggestion would be much appreciated !
CodePudding user response:
There might be fancier solutions but in essence (you can remove the square brackets if KEY, DATE, etc are not reserved words in your product - I used TSQL):
select *
from Tbl T1
where
/* Do not include if... */
NOT (
t1.[KEY] is null
/* This is part of the first KEY=NULL rows for this group
(no preceding record with KEY<>NULL) */
and not exists
(select 1
from Tbl T3
where T3.[YEAR]=T1.[YEAR]
and T3.PROD=T1.PROD
and T3.[DATE] < T1.[DATE]
and T3.[KEY] is not null
)
/* There are KEY<>NULL values further down */
and exists
(select 1
from Tbl T2
where T2.[YEAR]=T1.[YEAR]
and T2.PROD=T1.PROD
and T2.[DATE] > T1.[DATE]
and T2.[KEY] is not null
)
)
CodePudding user response:
This kind of query could help:
select YEAR, PROD, KEY, DATE
from (
select YEAR, PROD, KEY, DATE,
MIN(CASE WHEN KEY IS NULL THEN DATE ELSE NULL END)
OVER(PARTITION BY YEAR, PROD) AS MIN_NULL_KEY_DATE,
ROW_NUMBER() OVER(PARTITION BY YEAR, PROD ORDER BY DATE ASC) RN
from your_table yt
)rpr
where 1 = 1
and CASE WHEN RN = 1 AND DATE = MIN_NULL_KEY_DATE THEN 0 ELSE 1 END = 1
so what did I try to achieve here: when the key column null we just found the min date based on year and prod columns. And also check that row is the first row of that group or not. If the rn = 1 and the date is equal min date value of when the key is null, then just ignore them in case when.
CodePudding user response:
The following gets the output you desire. I am checking of the value of key column between rows unbounded preceeding and current row, and since NULL has the highest rank, if there are preceeding rows that are not null it would populate the field min_val with a NOT NULL column.
select * from (
select year,prod,key1,date1
,min(key1) over(partition by year,prod order by date1 asc) as min_val
from t
)x
where x.min_val is not null
------ ------- ------ ------------------------- ---------
| year | prod | key1 | date1 | min_val |
------ ------- ------ ------------------------- ---------
| 2011 | APPLE | TIME | 2011-11-18 00:00:00.000 | TIME |
| 2011 | APPLE | TIME | 2011-11-19 00:00:00.000 | TIME |
| 2013 | APPLE | TIME | 2014-04-08 00:00:00.000 | TIME |
| 2013 | APPLE | DIM | 2014-04-09 00:00:00.000 | DIM |
| 2013 | APPLE | TIME | 2014-11-10 10:50:14.113 | DIM |
| 2013 | APPLE | TIME | 2014-11-12 10:46:04.947 | DIM |
| 2013 | MELON | JAK | 2011-10-17 11:01:19.657 | JAK |
| 2013 | MELON | TIME | 2014-11-18 11:19:35.547 | JAK |
| 2013 | MELON | | 2014-11-19 11:19:35.547 | JAK |
| 2013 | MELON | TIME | 2014-11-21 10:32:36.017 | JAK |
| 2014 | APPLE | JAK | 2003-04-10 00:00:00.000 | JAK |
| 2014 | APPLE | DIM | 2003-04-11 00:00:00.000 | DIM |
| 2015 | APPLE | TIME | 2002-09-27 00:00:00.000 | TIME |
| 2015 | APPLE | | 2004-09-28 00:00:00.000 | TIME |
------ ------- ------ ------------------------- ---------
link https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ae82f64802674aa60005b8e9f534a150
