Want to check if the column has values that have certain length and contains only digits.
The problem is that the .rlike or .contains returns a Column type. Something like
.when(length(col("abc")) == 20 & col("abc").rlike(...), myValue)
won't work as col("abc").rlike(...) will return Column and unlike length(col("abc")) == 20 which returns Boolean (length() however also returns Column). How do I combine the two?
CodePudding user response:
After doing a bit of searching in compiled code, found this
def when(condition : org.apache.spark.sql.Column, value : scala.Any) : org.apache.spark.sql.Column
Therefore the conditions in when must return Column type. length(col("abc")) == 20 was evaluating to Boolean.
Also, found this function with the following signature
def equalTo(other : scala.Any) : org.apache.spark.sql.Column
So, converted the whole expression to this
.when(length(col("abc")).equalTo(20) && col("abc").rlike(...), myValue)
Note that the logical operator is && and not &.
Edit/Update : @Histro's comment is correct.
