I have a time series database of account numbers and their balances. Not all account numbers are present in each date.
Example:
| date | account_id | balance |
|---|---|---|
| 2021-01-01 | 1 | 10.0 |
| 2021-01-01 | 2 | 100.0 |
| 2021-01-01 | 3 | 78.0 |
| 2021-01-02 | 1 | 40.0 |
| 2021-01-02 | 2 | 200.0 |
| 2021-01-03 | 1 | 0.0 |
| 2021-01-04 | 1 | 0.0 |
| 2021-01-05 | 1 | 0.0 |
| 2021-01-06 | 1 | 7.0 |
| 2021-01-06 | 2 | 4.0 |
| 2021-01-06 | 3 | 0.0 |
As you can notice, just the account id 1 appears in all date.
I want a sql query that returns that account.
| date | account_id | balance |
|---|---|---|
| 2021-01-01 | 1 | 10.0 |
| 2021-01-02 | 1 | 40.0 |
| 2021-01-03 | 1 | 0.0 |
| 2021-01-04 | 1 | 0.0 |
| 2021-01-05 | 1 | 0.0 |
| 2021-01-06 | 1 | 7.0 |
This is just an example. It can be many accounts instead of just one.
Edit:
What I've done so far is to generate this query with Python which of course not optimal when the number dates grow:
SELECT *
FROM table_name
WHERE account_id IN (
SELECT DISTINCT account_id FROM table_name WHERE date_prod = '2021-01-01'
INTERSECT
SELECT DISTINCT account_id FROM table_name WHERE date_prod = '2021-01-02'
INTERSECT
SELECT DISTINCT account_id FROM table_name WHERE date_prod = '2021-01-03'
INTERSECT
SELECT DISTINCT account_id FROM table_name WHERE date_prod = '2021-01-04'
INTERSECT
SELECT DISTINCT account_id FROM table_name WHERE date_prod = '2021-01-05'
INTERSECT
SELECT DISTINCT account_id FROM table_name WHERE date_prod = '2021-01-06'
)
CodePudding user response:
You want to compare the unique dates against each account ID with the unique dates across the whole table. Something like this should work.
SELECT table1.date1,
table1.account_id,
table1.balance
FROM table1
INNER JOIN (SELECT account_id,
COUNT(DISTINCT date1) AS distinct_dates
FROM table1
GROUP BY account_id) account_dates ON table1.account_id = account_dates.account_id,
(SELECT COUNT(DISTINCT date1) AS expected_dates FROM table1) entire_table
WHERE account_dates.distinct_dates = entire_table.expected_dates;
CodePudding user response:
Suppose column date_prod is of type DATE.
Given MAX(date_prod) - MIN(date_prod) AS period_in_days .. WHERE account_id = x is a number of days.
and the number of rows per account is COUNT(date_prod) .. WHERE account_id = x.
When both numbers are equal for each account x then you can select values or balances that are present in each date ( daily records ).
It depends on your DBMS which SQL function or query syntax can solve that.
