I have a simple Spark DataFrame with an ID column holding integer values 1, 2, etc.:
+---+-----+
| ID| Tags|
+---+-----+
|  1|apple|
|  2| kiwi|
|  3| pear|
+---+-----+
I want to check whether a value such as 2 appears in the ID column in any row. As far as I can tell, the filter method is only useful for string columns. Any ideas?
UPDATE:
I was trying with:
df.filter(df.ID).contains(2)
In the end I need a boolean True or False as output.
CodePudding user response:
No, filter is not limited to string columns; it works with other data types as well.
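from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Sample rows as (ID, Tags) tuples
dataDictionary = [
    (1, "APPLE"),
    (2, "KIWI"),
    (3, "PEAR"),
]
df = spark.createDataFrame(data=dataDictionary, schema=["ID", "Tags"])
df.printSchema()
df.show(truncate=False)

# isEmpty() is True when no row matches, so negate it to get True when 2 is present
found = not df.filter("ID == 2").rdd.isEmpty()
print(found)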
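
The same check can also be wrapped in a small helper that returns True when the value is present. This is only a minimal sketch: the name column_contains is hypothetical (it is not part of PySpark), and df is assumed to be a PySpark DataFrame like the one above. It uses a Column comparison instead of a SQL string, and limit(1) so that at most one matching row has to be counted.

```python
def column_contains(df, column, value):
    """Return True if any row of df has `value` in `column`.

    Hypothetical helper; df is assumed to be a pyspark.sql.DataFrame.
    """
    # df[column] == value builds a Column expression, so the comparison
    # works for integer columns as well as string ones.
    matches = df.filter(df[column] == value)
    # limit(1) caps the result at one row before counting.
    return matches.limit(1).count() > 0
```

With the DataFrame above, column_contains(df, "ID", 2) should give True and column_contains(df, "ID", 5) should give False.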

