I have a PySpark DataFrame with several columns. I want to replace the end of each value in one column with a fixed string. For example:
"date" "other columns"
2022-01-11 19:51:37 00:00 ...
2022-01-11 20:51:55 00:00 ...
I would like to modify every row of "date", cutting off everything after the hour and appending "00:00 00:00". So the text will become:
"date" "other columns"
2022-01-11 19:00:00 00:00 ...
2022-01-11 20:00:00 00:00 ...
CodePudding user response:
Given that the values are strings, you could do that with the following:
from pyspark.sql import functions
# substring() is 1-based: keep the first 14 characters ("yyyy-MM-dd HH:"), then append the suffix
df = df.withColumn("date", functions.concat(functions.substring("date", 1, 14),
                                            functions.lit("00:00 00:00")))
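To see why a 14-character prefix is the right cut, the same logic can be checked in plain Python without a Spark session. This is only a sketch: the helper name `truncate_to_hour` and the two constants are hypothetical, and it assumes every value starts with a fixed-width "yyyy-MM-dd HH:" prefix as in the sample rows.

```python
# Hypothetical helper, not part of PySpark: it mirrors the
# "take a fixed-width prefix, then append a literal" approach above.
PREFIX_LENGTH = 14      # len("2022-01-11 19:") -> date, space, hour, colon
SUFFIX = "00:00 00:00"  # the fixed text requested in the question


def truncate_to_hour(value: str) -> str:
    """Keep the 'yyyy-MM-dd HH:' prefix and append the fixed suffix."""
    return value[:PREFIX_LENGTH] + SUFFIX


# Sample values taken from the question
for sample in ("2022-01-11 19:51:37 00:00", "2022-01-11 20:51:55 00:00"):
    print(truncate_to_hour(sample))
```

If the hour could ever be written with a single digit, the fixed-width cut would no longer line up, and a pattern-based replacement would probably be the safer choice.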
