Home > OS >  Casting string type column percentage to a decimal
Casting string type column percentage to a decimal

Time:01-19

I am working on this dataset having two columns(date->date, sales_%->string) -

date sales_%
20/12/2021 50%
21/12/2021 29%
22/12/2021 60%
23/12/2021 12%
24/12/2021 75%
25/12/2021 28%

I want to convert the sales_% column to a decimal value(by dividing the % by 100). The desired output should look like -

date sales
20/12/2021 0.5
21/12/2021 0.29
22/12/2021 0.6
23/12/2021 0.12
24/12/2021 0.75
25/12/2021 0.28

This is what i have tried -

spark.sql("""select date, cast(cast(replace("sales_%",'%','') as integer)/100) as decimal(2,2)) from table_name""")

I am getting nulls. Let me know if anyone has a better idea.

CodePudding user response:

You are referencing column name as "sales_%" which is interpreted as literal string by Spark. You need to use back-ticks instead of quotes. Also, there is no need cast to integer before division by 100. Try this:

spark.sql("""
    select  date, 
            cast(replace(`sales_%`,'%','')/100 as decimal(2,2)) as sales
    from    table_name
""").show()

# ---------- ----- 
#|      date|sales|
# ---------- ----- 
#|20/12/2021| 0.50|
#|21/12/2021| 0.29|
#|22/12/2021| 0.60|
#|23/12/2021| 0.12|
#|24/12/2021| 0.75|
#|25/12/2021| 0.28|
# ---------- ----- 
  •  Tags:  
  • Related