I'm using spark sql, and I created some Vues to join some data. but I have to join these Vues based on a string column. thats whu I had to use the like operator.
select table.perfume,table2.perfume
from global_temp.gv_table1 table1
join global_temp.gv_table2 table2
on(lower(table1.perfume) like CONCAT('%', lower(table2.perfume), '%') )
but the problem with this query it does not not give all the result, example. there'es a perfume on the table1 called "FlowerBomb" and a perfume on the table2 called "Flowerbomb Eau du parfum", after the join this perfume was not displayed. is there a problem with the like operator ?
CodePudding user response:
You've got the order of the columns wrong in your like expression.
Since table2.perfume contains table1.perfume, the expression should be like this:
on(lower(table2.perfume) like CONCAT('%', lower(table1.perfume), '%') )
CodePudding user response:
you may want to convert this to Spark APIs. Its fairly simple -
result = table1.alias('1').join(
table2.alias('2'),
F.expr("2.perfume like 1.perfume")
)
