I'm trying to understand this query statement.
write a query to show % of total profit that the top 10% of businesses generated.
1st : get total profit from all businesses
2nd : get % of all profits generated by very single business
3rd : sort that profit % by desc
4th : get top 10
that's what I've understood.
But somewhere am thinking its not correct what I've thought.
Can anyone please suggest what could be actual ask of this query ?
Note: cant ask back from where I got, because this was in interview :)
Thanks
CodePudding user response:
- Order the businesses by amount of profit.
- Take the top 10% of the total number of businesses (with or without ties).
- Calculate the total profit for that 10% of businesses.
- Compare it to the total profit for 100% of the businesses.
You can use the PERCENT_RANK analytic function to give a percentage rank to the businesses in order of profit and then use conditional aggregation to sum the profit for those businesses with the top 10% profit and compare it to the total profit to get the percentage profit made by the top 10% of businesses:
SELECT SUM(CASE WHEN pct_rnk <= 0.1 THEN profit END)
/ SUM(profit) * 100 AS percent_profit_for_top_10_percent
FROM (
SELECT profit,
PERCENT_RANK() OVER (ORDER BY profit DESC) AS pct_rnk
FROM businesses
)
You can also use:
SELECT (SELECT SUM(profit)
FROM (
SELECT profit
FROM businesses
ORDER BY profit DESC
FETCH FIRST 10 PERCENT ROWS WITH TIES
)
)
/
(SELECT SUM(profit)
FROM businesses)
* 100 AS percent_profit_for_top_10_percent
FROM DUAL;
Which, for the sample data:
CREATE TABLE businesses (profit) AS
SELECT 1000 FROM DUAL CONNECT BY LEVEL <= 10 UNION ALL
SELECT 2000 FROM DUAL CONNECT BY LEVEL <= 8 UNION ALL
SELECT 3000 FROM DUAL CONNECT BY LEVEL <= 6 UNION ALL
SELECT 4000 FROM DUAL CONNECT BY LEVEL <= 4 UNION ALL
SELECT 5000 FROM DUAL CONNECT BY LEVEL <= 2 UNION ALL
SELECT 10000 FROM DUAL CONNECT BY LEVEL <= 1;
Both output:
PERCENT_PROFIT_FOR_TOP_10_PERCENT 45
db<>fiddle here
