(SOLVED) How to use PySpark to modify all rows of one column?

Time:01-31

I have a PySpark DataFrame with several columns. I want to concatenate a phrase to each row of one column. For example:

         "date"                  "other columns"
2022-01-11 19:51:37 00:00              ...
2022-01-11 20:51:55 00:00              ...

I would like to modify every row of "date" by cutting everything that comes after the hour and appending "00:00 00:00", so the text becomes:

         "date"                  "other columns"
2022-01-11 19:00:00 00:00              ...
2022-01-11 20:00:00 00:00              ...

CodePudding user response:

Given that the values are strings, you could do it as follows:

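from pyspark.sql import functions

# substring() positions are 1-based in Spark: keep the first 14 characters
# ("YYYY-MM-DD HH:") and append the zeroed minutes, seconds and offset.
df = df.withColumn(
    "date",
    functions.concat(functions.substring("date", 1, 14), functions.lit("00:00 00:00")),
)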
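
To see why the prefix length is 14, the same string logic can be checked in plain Python without a Spark session. This is only a sketch of the slicing step, run on the two sample values from the question; `truncate_to_hour` is a hypothetical helper name used for illustration and is not part of PySpark.

```python
def truncate_to_hour(date_str: str) -> str:
    """Keep 'YYYY-MM-DD HH:' (14 characters) and append the zeroed remainder."""
    # The first 14 characters are the date (10), a space (1), the hour (2)
    # and the colon after it (1): the same prefix the Spark substring call keeps.
    prefix = date_str[:14]
    return prefix + "00:00 00:00"


# Sample values taken from the question.
samples = ["2022-01-11 19:51:37 00:00", "2022-01-11 20:51:55 00:00"]
results = [truncate_to_hour(s) for s in samples]

for original, updated in zip(samples, results):
    print(original, "->", updated)
```

Note that this only works when every value follows the same fixed-width layout; if the column held real timestamps instead of strings, a different approach would be needed.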