lv_seconds_back = mv_time_horizon.select(col("max(time_horizon)") * 60).show()
mv_now =spark.sql("select from_unixtime(unix_timestamp()) as mv_now")
local_date_time =mv_now.select(date_format('mv_now', 'HH:mm:ss').alias("local_date_time"))
lv_start =local_date_time.select(col("local_date_time") - expr("INTERVAL $lv_seconds_back seconds"))
How do i substract no of seconds which is in lv_seconds_back variable in the lv start
I tried using expr(interval seconds) but it wont take the variable but takes number.
Also if I need too add that lv_start in the query how do i do that
mt_cache_fauf_r_2= spark.sql("select mt_cache_fauf_r_temp from mt_cache_fauf_r_temp where RM_ZEITPUNKT>= ${lv_start} & RM_ZEITPUNKT <= ${lv_end}")
This doesn't work
CodePudding user response:
Maybe you should remove .show() so that your column is captured in lv_seconds_back variable.
CodePudding user response:
Your first line has two issues. The first one: col("max(time_horizon)") cannot work because the col function expects a column name. Either do expr("max(time_horizon)") or max(col("time_horizon")). Then, the show function displays part of the dataframe but does not return anything. Therefore it does not make sense to assign the result of show to a variable.
If you remove show and with the col call, the result is a dataframe with one row of one element. The first function can get you the Row object from which you can access its only element like this:
lv_seconds_back = mv_time_horizon.select(F.expr("max(time_horizon)") * 60).first()[0]
Then, if you want to substract that value from a timestamp, do it before you convert it to a string:
mv_now = spark.sql(f"select from_unixtime(unix_timestamp() - {lv_seconds_back}) as mv_now")
