How to pass a variable into a PySpark sequence to generate a time series?

Time:01-18

I want to generate a time series from 2021-12-01 to 2021-12-31, but I want to pass those values into the sequence function through variables.

This is my code:

from pyspark.sql import SparkSession
import pyspark.sql.functions as sf

spark = SparkSession.builder.appName('sparkdf').getOrCreate()

TyP_dias = spark.createDataFrame([('null','null')], ['MES','NEGOCIO'])

TyP_df0 = TyP_dias.withColumn('FECHA', sf.explode(sf.expr("sequence(to_date('2021-12-01'), to_date('2021-12-31'), interval 1 day)"))).show()

I want the values 2021-12-01 and 2021-12-31 inside variables.

Something like:

spark = SparkSession.builder.appName('sparkdf').getOrCreate()

TyP_dias = spark.createDataFrame([('null','null')], ['MES','NEGOCIO'])

eldia1 = '2021-12-01'
eldia2 = '2021-12-31'

TyP_df0 = TyP_dias.withColumn('FECHA', sf.explode(sf.expr("sequence(to_date(eldia1), to_date(eldia2), interval 1 day)"))).show()

And get this result:

+----+-------+----------+
| MES|NEGOCIO|     FECHA|
+----+-------+----------+
|null|   null|2021-12-01|
|null|   null|2021-12-02|
|null|   null|2021-12-03|
|null|   null|2021-12-04|
|null|   null|2021-12-05|
|null|   null|2021-12-06|
|null|   null|2021-12-07|
|null|   null|2021-12-08|

But instead I'm receiving:

cannot resolve 'eldia1' given input columns: [MES, NEGOCIO];

User response:

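The easiest approach is to use Python string formatting to insert the variable values into your SQL expression. Inside a plain string passed to sf.expr, eldia1 is just text, so Spark tries to resolve it as a column name, which is why the error lists the input columns. An f-string substitutes the values before Spark parses the expression: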

# Keep the DataFrame in TyP_df0; show() returns None, so call it separately.
TyP_df0 = TyP_dias.withColumn('FECHA', sf.explode(sf.expr(f"sequence(to_date('{eldia1}'), to_date('{eldia2}'), interval 1 day)")))
TyP_df0.show()

+----+-------+----------+
| MES|NEGOCIO|     FECHA|
+----+-------+----------+
|null|   null|2021-12-01|
|null|   null|2021-12-02|
|null|   null|2021-12-03|
|null|   null|2021-12-04|
|null|   null|2021-12-05|
|null|   null|2021-12-06|
|null|   null|2021-12-07|
|null|   null|2021-12-08|
|null|   null|2021-12-09|
|null|   null|2021-12-10|
|null|   null|2021-12-11|
|null|   null|2021-12-12|
|null|   null|2021-12-13|
|null|   null|2021-12-14|
|null|   null|2021-12-15|
|null|   null|2021-12-16|
|null|   null|2021-12-17|
|null|   null|2021-12-18|
|null|   null|2021-12-19|
|null|   null|2021-12-20|
+----+-------+----------+
only showing top 20 rows
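
If the dates come from outside your script, it may be worth validating them before formatting them into the expression. The following is only a sketch: date_sequence_expr is a hypothetical helper name, not part of PySpark, and it assumes the dates are ISO strings such as the ones in the question. It builds the same expression text as the f-string above, using only the Python standard library.

```python
from datetime import date


def date_sequence_expr(start: str, end: str) -> str:
    """Return the text of a Spark SQL sequence() expression with a 1-day step.

    Hypothetical helper for illustration; it only builds a string and
    does not call Spark itself.
    """
    # date.fromisoformat raises ValueError if the string is not a valid
    # ISO date, so malformed input never reaches the SQL expression.
    start_day = date.fromisoformat(start)
    end_day = date.fromisoformat(end)
    # Re-emit both dates in YYYY-MM-DD form inside quoted SQL literals.
    return (
        f"sequence(to_date('{start_day.isoformat()}'), "
        f"to_date('{end_day.isoformat()}'), interval 1 day)"
    )


eldia1 = '2021-12-01'
eldia2 = '2021-12-31'
expr_text = date_sequence_expr(eldia1, eldia2)
print(expr_text)
```

The resulting string can then be passed to sf.expr in place of the inline f-string, for example sf.explode(sf.expr(expr_text)).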