I need to correctly label each row based on certain criteria. For example the data I have is like this: Table Product
Group product_id product_name category
1 123 Egg A
1 456 Egg A
1 456 Milk A
1 789 Milk A
2 135 Apple B
2. 137 Orange B
2. 137 Banana B
2. 139 Strawberry B
3. 235 Egg A
3. 237 Apple B
3. 237 Egg B
3. 239 Orange B
3. 239 Egg B
Since product egg can be found in more than 1 product IDs and milk can be found in more than 1 product IDs, 123,456 and 789 should be marked as A. Basically if a product name appears only once in a group, then it is marked as A, otherwise B.
I was trying to use array functions and compare them, but it doesn't work for this scenario. For example,
select product_id,array_agg(product_name) as p1 from product
Then compare p1 with another array (p2) from the self inner join.
Any hints or help would be greatly appreciated!
CodePudding user response:
Have you considered using a Case When statement?
Case
when product_name = 'egg' and category = 'a' then label = 'egg1'
when product_name = 'egg' and category = 'b' then label = 'egg2'
else 'no label'
End
I am referencing this post https://dba.stackexchange.com/questions/82487/case-with-multiple-conditions for clarity. - J
CodePudding user response:
I am confused with your requirement. You state product name appears only once in a group, then it is marked as A, otherwise B. However, the data you show contains the exact opposite. The following produces what you said you wanted, not the values you posted. It will be correct or exactly the reverse. (See demo)
-- if a product name appears only once in a group, then it is marked as A, otherwise B.
with prod_group (group_id, product_name, cnt) as
( select group_id, product_name, count(*)
from products
group by group_id, product_name
) -- select * from prod_group ;
update products p
set category = case when grp.cnt = 1 then 'A' else 'B' end
from prod_group grp
where ( p.group_id, p.product_name) = ( grp.group_id, grp.product_name);
How it works: The prod_group CTE simply counts the number of times a product name appears in a group. The main "query" then uses this result to update category. Contrary to to your statement case isn't really going to help the CASE expression is exactly what you need.
Note: GROUP is an extremely poor choice for a column name as it is both a Postgres (conditional) and a SQL Standard reserved word.
