I have a dataframe with claims information.
I want to filter out data which has successful payment without Declining.
Ex
| T_ID | Card Number | Status | Amount |
|---|---|---|---|
| 1 | 1234 | PAID | 100 |
| 1 | 1234 | REVERSAL | -30 |
| 2 | 1234567 | PAID | 70 |
The output of this should be
| T_ID | Card Number | Status | Amount |
|---|---|---|---|
| 1 | 1234 | PAID | 70 |
| 2 | 1234567 | PAID | 70 |
Cancelling first 2 transactions as ID and CardNumber is same but it has a corresponding Reversal of -$30
CodePudding user response:
With the provided information, I have made following assumption:
- Amount associated with
PAIDandREVERSALstatus should only be considered. Rest to be ignored. - In the final output, there has to be a
statuscolumn with value set toPAID.
To filter out successful payment without declining, you'll have to provide information about time stamps.
select
T_ID, CardNumber, "PAID" as Status, SUM(Amount) as Amount
from
(
select T_ID, CardNumber,
case
when Status = 'PAID' then Amount
when Status = 'REVERSAL' then Amount
else 0
end as Amount
from
Claim
)
group by
T_ID, CardNumber, Status
It's a very basic implementation but it is as per information provided.
CodePudding user response:
Another approach would be using Common Table Expression (Same can be achieved using Spark SQL)
-- claim table
create or replace table tx(
T_ID number,
Card_Number number,
Status text,
Amount number(10,2)
);
sample data
insert into tx values
(1,1234,'PAID',100.00),
(1,1234,'REVERSAL', -30),
(2,1234567,'PAID',70);
sample query
with total_cte as (
select T_ID,
Card_Number,
sum(Amount) as total_balance
from tx group by T_ID,Card_Number
)
select t.T_ID, t.Card_Number, t.Status, s.total_balance
from tx t join total_cte as s
where t.T_ID = s.T_ID and t.Card_Number = s.Card_Number
and t.status = 'PAID'
;
CodePudding user response:
Another approach would be using Common Table Expression (Same can be achieved using Spark SQL)
-- claim table
create or replace table tx(
T_ID number,
Card_Number number,
Status text,
Amount number(10,2)
);
sample data
insert into tx values
(1,1234,'PAID',100.00),
(1,1234,'REVERSAL', -30),
(2,1234567,'PAID',70);
sample query
with total_cte as (
select T_ID,
Card_Number,
sum(Amount) as total_balance
from tx group by T_ID,Card_Number
)
select t.T_ID, t.Card_Number, t.Status, s.total_balance
from tx t join total_cte as s
where t.T_ID = s.T_ID and t.Card_Number = s.Card_Number
and t.status = 'PAID'
;

