Home > Blockchain >  Joining dataframe and replacing column value in scala
Joining dataframe and replacing column value in scala

Time:01-21

I am trying to join two apache spark sql DataFrame and replace column value of first dataframe with another. Eg:

Df1:

col1 | col2  |  other columns .... say (col-x, col-y, col-z)
------------ |--------------------------------
x    |  a    |random values
y    |  b    |random values
z    |  c    |random values

Df2:

col1 | col3  | other columns .. say (col-a, col-b, col-c)
-------------|--------------------------------
x    |  a1   |different random values
y    |  b1   |different random values
w    |  w1   |different random values

resultant dataframe should be

DF:

col1 | col2  | other columns of DF1 (col-x. col-y, col-z)
-------------|-------------------------------
a1   |  a    |random values
b1   |  b    |random values
z    |  c    |random values

I need to perform left join and replace values of DF1.col1 with DF2.col3 wherever DF1.col1 = DF2.col1. I am not sure how to do that. Furthermore, as it can be seen in above example, DF1 has a lot more columns other than "col1" and "col2" and I cannot apply select on all of them. I was trying something like,

val df = df1.join(df2, Seq("col1"), "left").select(
  coalesce(df2("col2"), df1("col1")).as("col1")
)

but this doesn't seem to work. Also, I think it will filter out other columns of DF1. I want to keep all columns of DF1.

How can I do this in Scala?

CodePudding user response:

You can construct the required 3 columns as follows.

val df = df1.join(df2, Seq("col1"), "left").select(coalesce(df2("col3"), df1("col1")).as("col1"),col("col2"), col("colx"))

CodePudding user response:

For get all columns from "df1" after join, alias can be used for Dataframe:

val updatedCol1 = coalesce(df2("col3"), df1("col1")).alias("col1")
val columns = updatedCol1 :: df1.columns
  .filterNot(_ == "col1")
  .map(cname => col("df1."   cname))
  .toList

df1.alias("df1")
  .join(df2, Seq("col1"), "left")
  .select(columns: _*)
  •  Tags:  
  • Related