I'm working with a table called "du_vertrag", and I'm trying to build a query that selects distinct values for "id" and "status" pairs.
Input Table:
| id | status | date |
|---|---|---|
| 6357990 | submitted | 20201008 |
| 6357990 | submitted | 20201009 |
| 6357990 | submitted | 20201010 |
| 6357990 | submitted | 20201011 |
| 6357990 | activ | 20210501 |
| 3732634 | submitted | 20200909 |
| 3732634 | submitted | 20200910 |
| 3732634 | submitted | 20200911 |
| 3732634 | submitted | 20200901 |
| 3732634 | activ | 20210803 |
| 3732634 | canceled | 20211202 |
| 2644833 | submitted | 20211008 |
| 2644833 | submitted | 20211009 |
| 2644833 | submitted | 20211010 |
| 2644833 | submitted | 20211011 |
| 2644833 | activ | 20220201 |
Expected Output:
| id | status | date |
|---|---|---|
| 6357990 | submitted | 20201008 |
| 6357990 | activ | 20210501 |
| 3732634 | submitted | 20200909 |
| 3732634 | activ | 20210803 |
| 3732634 | canceled | 20211202 |
| 2644833 | submitted | 20211008 |
| 2644833 | activ | 20220201 |
For a single id, I've attempted to solve it using a group by statement over the two fields:
SELECT id, status, date
FROM du_vertrag
where id = '6357990'
group by id, status;
Which returns the correct output for the specific id:
| id | status | date |
|---|---|---|
| 6357990 | submitted | 20201008 |
| 6357990 | activ | 20210501 |
Though, when I remove the WHERE clause condition, and I try using the same GROUP BY clause (also in combination with DISTINCT), I can't get the original output for every id.
Can you help me with this problem?
CodePudding user response:
If you need distinct values for each "id" and "status", you can aggregate over those two fields and select the minimum date to get your output:
SELECT id,
status,
MIN(date) AS date
FROM du_vertrag
GROUP BY id,
status
If you can share what your DBMS is, I can link a fiddle where you can test this solution.
