MySQL how to cross check a single cell of a table, against the entire column of another table?-CodePudding

I have two tables. They are as follows:

TABLE1 Table1 is the dictionary table:

word
------
apple
orange
banana
pear
guava
cherry
peach

TABLE2 Table2 is the setence table:

id | sentence
--- ----------
1  | john loves apple and orange
2  | jane prefers bananna, CARROT and pear
.  | ..
.  | ...
9  | peter likes guava, WATER MELON, CELERY and pear

DESIRED RETURN RESULTS I would like to run for example id=2 row of table2, against every row of table1 such that the query returns:

banana
pear

Or if I run id=9 row of table2 against table1, then the return result should be:

guava
pear

SQL QUERY I HAVE TRIED SO FAR (AND FAILED)

SELECT word 
FROM table1 
WHERE SUBSTRING( word ,  (SELECT sentence FROM table2 WHERE id=2 ) );

This returned an empty set

SELECT word 
FROM table1 
WHERE SUBSTRING( word ,  (SELECT sentence FROM table2 WHERE id=2 ) )>0;

This returned an empty set also

CodePudding user response：

You can USE 'LIKE' in your WHERE part. It matches two Strings like '=' does. BUT you can use % and other symbols as placeholders. E.g.:

SELECT t2.id
 FROM table2 t2 
WHERE t2.sentence LIKE '%apple%'

This would give you every id where the sentence contains 'apple'. Using 'apple%' would give you every entry where the sentence starts with 'apple' and vice versa.

In essence, % is a placeholder for zero, one or multiple characters.

CodePudding user response：

As noted by A-Tech, you can use LIKE with wildcards. You will need to use the CONCAT function to generate the patterns. It is also possible to use RLIKE and not have to concatenate, but you might run into false positives, in which case you would need to add word boundary markers to the regular expressions, which puts you back to using CONCAT or you could add the boundaries to the word list itself. You can also use GROUP_CONCAT to group the matches by id. See this fiddle:

https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=27141936e2402a1a2e02de0153203806

First, with LIKE:

SELECT s.id, w.word
FROM sentences s
JOIN words w on s.sentence like concat('%', w.word, '%');

then with RLIKE:

SELECT s.id, w.word
FROM sentences s
JOIN words w on s.sentence rlike w.word;

then with the maches grouped with GROUP_CONCAT:

SELECT s.id, group_concat(w.word) as matches
FROM sentences s
JOIN words w on s.sentence rlike w.word
GROUP BY s.id;

Then, I add some partial matches to the table to demonstrate the word boundaries. You can see that the next to last query in the fiddle shows how partial matches can confound your results. To avoid that, I add the word boundaries to the built expressions to force only full matches:

SELECT s.id, group_concat(w.word) as matches
FROM sentences s
JOIN words w on s.sentence rlike concat('[[:<:]]', w.word, '[[:>:]]')
GROUP BY s.id;

Probably LIKE is the way to go, but it really depends on your use case and what exactly you are trying to accomplish, and how complicated you want your pattern matching to be. For instance, you might want to be able to match apple and apples, or peach, peaches and peachy, or something more complicated, and you can make regular expressions just about as complex as you need to.

A problem with LIKE and the wild card is that you might want to match on pear but not spear, for instance. RLIKE gives you a little more control over the matching. Also, depending on the size of your data LIKE might outperform RLIKE.

So it all depends on what exactly you are trying to do.