Home > database >  Compare values between two tables with over partition criteria
Compare values between two tables with over partition criteria

Time:01-13

DB-Fiddle

/* Table Campaigns */
CREATE TABLE campaigns (
    id SERIAL PRIMARY KEY,
    insert_time DATE,
    campaign VARCHAR,
    tranches VARCHAR,
    quantity DECIMAL);

INSERT INTO campaigns
(insert_time, campaign, tranches, quantity)
VALUES 
('2021-01-01', 'C001', 't', '500'),
('2021-01-01', 'C002', 't', '600'),

('2021-01-02', 'C001', 't', '500'),
('2021-01-02', 'C002', 't', '600');


/* Table Tranches */
CREATE TABLE tranches (
    id SERIAL PRIMARY KEY,
    insert_time DATE,
    campaign VARCHAR,
    tranches VARCHAR,
    quantity DECIMAL);

INSERT INTO tranches
(insert_time, campaign, tranches, quantity)
VALUES 
('2021-01-01', 'C001', 't1', '200'),
('2021-01-01', 'C001', 't2', '120'),
('2021-01-01', 'C001', 't3', '180'),

('2021-01-01','C002', 't1', '350'),
('2021-01-01','C002', 't2', '250'),

('2021-01-02', 'C001', 't1', '400'),
('2021-01-02', 'C001', 't2', '120'),
('2021-01-02', 'C001', 't3', '180'),

('2021-01-02','C002', 't1', '350'),
('2021-01-02','C002', 't2', '250');

Expected Result:

insert_time   |  campaign  |  tranches  |  quantity_campaigns |  quantity_tranches  |    check
--------------|------------|------------|---------------------|---------------------|-----------
2021-01-01    |    C001    |      t     |          500        |         500         |    ok
2021-01-01    |    C002    |      t     |          600        |         600         |    ok
--------------|------------|------------|---------------------|---------------------|------------
2021-01-02    |    C001    |      t     |          500        |         700         |    error   
2021-01-02    |    C002    |      t     |          600        |         500         |    ok

I want to compare the total quantity per campaign in table campaigns with the total quantity per campaign in table tranches.

So far I have been able to develop this query:

SELECT
c.insert_time AS insert_time,
c.campaign AS campaign,
c.tranches AS tranches,
c.quantity AS quantity_campaigns,
t.quantity AS quantity_tranches,

(CASE WHEN 
MAX(c.quantity) OVER(PARTITION BY c.insert_time, c.campaign) = SUM(t.quantity) OVER(PARTITION BY t.insert_time, t.campaign) 
THEN 'ok' ELSE 'error' END) AS check

FROM campaigns c
LEFT JOIN tranches t ON c.campaign = t.campaign
ORDER BY 1,2,3,4,5;

However, it does not give me the expected result?
What do I need to change to make it work?

CodePudding user response:

I think the result you're looking for should be something like this. The problem is that you're trying to aggregate over two groupings after a join which will either yield too many results or incorrect calculations. By aggregating in CTE, and then joining the CTEs after aggregation has occurred you can achieve the results you are looking for. See my example below:

WITH campaign_agg AS(
  SELECT c.insert_time, c.campaign, c.tranches, MAX(c.quantity) c_quantity
  FROM campaigns c
  GROUP BY c.insert_time, c.campaign, c.tranches
), tranch_agg AS(
  SELECT t.insert_time, t.campaign, SUM(t.quantity) as t_sum
  FROM tranches t
  GROUP BY t.insert_time, t.campaign
)
SELECT c.insert_time, c.campaign, c.tranches, c.c_quantity, t.t_sum,
    CASE WHEN c.c_quantity = t.t_sum THEN 'ok' ELSE 'error' END as check
FROM campaign_agg c
JOIN
    tranch_agg t ON 
        t.insert_time = c.insert_time
        AND t.campaign = c.campaign
ORDER BY c.insert_time, c.campaign

I have a db-fiddle for this as well: https://www.db-fiddle.com/f/33x4upVEcgTMNehiHCKzfN/1

CodePudding user response:

DB-Fiddle

SELECT
c.insert_time AS insert_time, 
c.campaign AS campaign, 
c.tranches AS tranches, 
SUM(c.quantity) AS quantity_campaigns,
SUM(t1.quantity) AS quantity_tranches,
(CASE WHEN SUM(c.quantity) <> SUM(t1.quantity) THEN 'error' ELSE 'ok' END) AS check
FROM campaigns c
LEFT JOIN

   (SELECT
   t.insert_time AS insert_time,
   t.campaign AS campaign,
   SUM(t.quantity) AS quantity
   FROM tranches t
   GROUP BY 1,2
   ORDER BY 1,2) t1 on t1.insert_time = c.insert_time AND t1.campaign = c.campaign
   
GROUP BY 1,2,3
ORDER BY 1,2,3;
  •  Tags:  
  • Related