I have a table where two columns have values separated by a comma, and another table of categories. group_1 can have anything up to 30 separated values and group_2 can have anything up to 5 separated values
myProducts
| alias | group_1 | group_2 |
|---|---|---|
| product_a | 1,2,3,[...] | uk, us [...] |
| product_b | 2,4,[...] | uk, us, [...] |
| product_c | 1,4,[...] | spain, germany, [...] |
myCategories
| id | category |
|---|---|
| 1 | category_a |
| 2 | category_b |
| 3 | category_c |
| 4 | category_d |
Is it possible with MySQL to split both the comma separated fields into multiple rows and return the results after looking up the value from the categories table. So in the example above, the FIRST row of the original table would return:
| alias | group_1 | group_2 |
|---|---|---|
| product_a | category_a | uk |
| product_a | category_a | us |
| product_a | category_b | uk |
| product_a | category_b | us |
| product_a | category_c | uk |
| product_a | category_c | us |
| ... | ... | ... |
The lookup part is desired, but if that proves to be too complicated, I can live without that part.
CodePudding user response:
Yes, but you won't be happy with the performance.
You can match a comma-separated list against an individual value using MySQL's FIND_IN_SET() function.
select p.alias, a.category as group_1, c.country as group_2
from myProducts p join myCategories a on find_in_set(a.id, p.group_1)
join countries c on find_in_set(c.country, p.group_2);
----------- ------------ ---------
| alias | group_1 | group_2 |
----------- ------------ ---------
| product_c | category_a | spain |
| product_c | category_d | spain |
| product_a | category_a | uk |
| product_a | category_b | uk |
| product_b | category_b | uk |
| product_a | category_c | uk |
| product_b | category_d | uk |
----------- ------------ ---------
I did create another lookup table countries:
create table countries (country varchar(20) primary key);
insert into countries values ('uk'),('us'),('spain'),('germany');
Caveat: if the comma-separated list has spaces, they will be treated as part of each string in the list, so you want to remove spaces.
select p.alias, a.category as group_1, c.country as group_2
from myProducts p join myCategories a on find_in_set(a.id, p.group_1)
join countries c on find_in_set(c.country, replace(p.group_2,' ',''));
----------- ------------ ---------
| alias | group_1 | group_2 |
----------- ------------ ---------
| product_c | category_a | germany |
| product_c | category_d | germany |
| product_c | category_a | spain |
| product_c | category_d | spain |
| product_a | category_a | uk |
| product_a | category_b | uk |
| product_b | category_b | uk |
| product_a | category_c | uk |
| product_b | category_d | uk |
| product_b | category_b | us |
| product_b | category_d | us |
----------- ------------ ---------
But there's no way to optimize the lookups with indexes if you do this. So every join will be a table-scan. As your tables gets larger, you'll find the performance degrades to the point of being unusable.
The way to optimize this is to avoid using comma-separated lists. Normalize many-to-many relationships into new tables. Then the lookups can use indexes, and you'll avoid the degraded performance, in addition to all the other problems with using comma-separated lists.
Re your comment:
You can create a derived table by listing countries explicitly:
FROM ...
JOIN (
SELECT 'us' AS country UNION SELECT 'uk' UNION SELECT 'spain' UNION SELECT 'germany'
) AS c
But this is getting pretty ridiculous. You aren't using SQL to any advantage. You might as well just fetch the whole dataset back into your client application and sort it into some data structures in memory.
