I have the following overloaded method, whose input can be either an Option[String] or an Option[Seq[String]]:
def parse_emails(email: => Option[String]) : Seq[String] = {
  email match {
    case Some(e : String) if e.isEmpty() => null
    case Some(e : String) => Seq(e)
    case _ => null
  }
}
def parse_emails(email: Option[Seq[String]]) : Seq[String] = {
  email match {
    case Some(e : Seq[String]) if e.isEmpty() => null
    case Some(e : Seq[String]) => e
    case _ => null
  }
}
I want to use these methods from Spark, so I tried to wrap them as a UDF:
def parse_emails_udf = udf(parse_emails _)
But I am getting the following error:
error: ambiguous reference to overloaded definition,
both method parse_emails of type (email: Option[Seq[String]])Seq[String]
and method parse_emails of type (email: => Option[String])Seq[String]
match expected type ?
def parse_emails_udf = udf(parse_emails _)
Is it possible to define a UDF that wraps both alternatives?
Or is it possible to create two UDFs with the same name, each pointing to one of the overloads? I tried the approach below, but it throws another error:
def parse_emails_udf = udf(parse_emails _ : Option[Seq[String]])
error: type mismatch;
found : (email: Option[Seq[String]])Seq[String] <and> (email: => Option[String])Seq[String]
required: Option[Seq[String]]
def parse_emails_udf = udf(parse_emails _ : Option[Seq[String]])
CodePudding user response:
Option[String] and Option[Seq[String]] both erase to Option at runtime, so even if Spark supported UDF overloading, it could not tell the two apart.
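To make the erasure point concrete, here is a small sketch in plain Scala (no Spark involved). The object name ErasureDemo, the helper payloadKind and the sample address are made up for illustration. It shows that both values share one runtime class and that only the wrapped value's own class distinguishes them:

```scala
object ErasureDemo {
  val one: Option[String] = Some("a@example.com")
  val many: Option[Seq[String]] = Some(Seq("a@example.com"))

  // Type arguments are erased, so both values have the runtime class scala.Some.
  val sameRuntimeClass: Boolean = one.getClass == many.getClass

  // Hypothetical helper: inspects the wrapped value, since the Option itself
  // carries no information about its element type at runtime.
  def payloadKind(arg: Option[Any]): String = arg match {
    case Some(_: String) => "String"
    case Some(_: Seq[_]) => "Seq"
    case Some(_)         => "other"
    case None            => "None"
  }
}
```

So any dispatch between the two cases has to look at the value inside the Option, not at the Option's static type.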
What you can do is create one function that accepts anything, then match on the argument and handle the different cases:
def parseEmails(arg: Option[AnyRef]) = arg match {
  case Some(x) =>
    x match {
      case str: String =>
        ??? // todo
      // Match on Seq[_]: the element type is erased, so Seq[String] cannot be checked at runtime
      case s: Seq[_] =>
        ??? // todo
      case _ =>
        throw new IllegalArgumentException()
    }
  case None =>
    ??? // todo
}
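For reference, one possible way to fill in the todos is sketched below. It is an assumption on my part, not the only option: it mirrors the null results of the two overloads in the question, keeps only the String elements of a Seq, and takes Option[Any] instead of Option[AnyRef]. The object name ParseEmailsSketch is made up, and wrapping the function with Spark's udf is not shown or verified here.

```scala
object ParseEmailsSketch {
  // Single entry point that dispatches on the runtime class of the wrapped value.
  // Returns null for empty or missing input, as the overloads in the question do;
  // Seq.empty would be a more idiomatic choice if callers allow it.
  def parseEmails(arg: Option[Any]): Seq[String] = arg match {
    case Some(s: String) if s.isEmpty    => null
    case Some(s: String)                 => Seq(s)
    case Some(xs: Seq[_]) if xs.isEmpty  => null
    // Element types are erased, so keep only the elements that really are Strings.
    case Some(xs: Seq[_])                => xs.collect { case s: String => s }
    case Some(other) =>
      throw new IllegalArgumentException(
        s"Unsupported argument type: ${other.getClass.getName}")
    case None                            => null
  }
}
```

Called directly, ParseEmailsSketch.parseEmails(Some("a@example.com")) yields a one-element Seq, and passing Some(Seq("a@example.com", "b@example.com")) returns both addresses.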
