I have two DFs
df1:
--- ----- --------
|key|price| date|
--- ----- --------
| 1| 1.0|20210101|
| 2| 2.0|20210101|
| 3| 3.0|20210101|
--- ----- --------
df2:
--- -----
|key|price|
--- -----
| 1| 1.1|
| 2| 2.2|
| 3| 3.3|
--- -----
I'd like to replace price column values from df1 with price values from df2 where df1.key == df2.key
Expected output:
--- ----- --------
|key|price| date|
--- ----- --------
| 1| 1.1|20210101|
| 2| 2.1|20210101|
| 3| 3.3|20210101|
--- ----- --------
I've found some solutions in python but I couldn't come up with a working solution in Scala.
CodePudding user response:
Simply join drop df1 column price:
val df = df1.join(df2, Seq("key")).drop(df1("price"))
df.show
// --- ----- --------
//|key|price| date|
// --- ----- --------
//| 1| 1.1|20210101|
//| 2| 2.2|20210101|
//| 3| 3.3|20210101|
// --- ----- --------
Or if you have more entries in df1 and you want to keep their price when there is no match in df2 then use left join coalesce expression:
val df = df1.join(df2, Seq("key"), "left").select(
col("key"),
col("date"),
coalesce(df2("price"), df1("price")).as("price")
)
