I need to remove trailing zeros from a decimal column. I tried a regex and strip(), and both worked, but they operate on strings, so the column ends up as a string. I want Col_2 to remain a decimal, without changing its precision and scale. Can someone please suggest an alternative?
Sample table:
Col_1 : string
Col_2 : decimal(18,2)
| Col_1 | Col_2 |
|---|---|
| ABC | 2.00 |
| DEF | 2.50 |
| XMN | 0.00 |
| RST | 1.28 |
Actual result using regex:
Col_1 : string
Col_2 : string
| Col_1 | Col_2 |
|---|---|
| ABC | 2 |
| DEF | 2.5 |
| XMN | |
| RST | 1.28 |
Expected result:
| Col_1 | Col_2 |
|---|---|
| ABC | 2 |
| DEF | 2.5 |
| XMN | 0 |
| RST | 1.28 |
CodePudding user response:
A decimal(18,2) column always stores two digits after the decimal point. Whether the trailing zeros are shown is purely a matter of formatting.
So, if you want Col_2 to stay a decimal and keep its precision and scale, store it as decimal(18,2) and format it however you like when displaying the data.
You can use the format_number function for that. Note that it returns a string, so apply it only at display time:
import pyspark.sql.functions as F
df.withColumn("Col_2", F.expr("format_number(Col_2, '0.##')")).show()
#+-----+-----+
#|Col_1|Col_2|
#+-----+-----+
#|  ABC|    2|
#|  DEF|  2.5|
#|  XMN|    0|
#|  RST| 1.28|
#+-----+-----+
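
To illustrate the point that trailing zeros belong to the display and not to the value, here is a minimal sketch in plain Python using the standard decimal module instead of Spark. The helper name display_without_trailing_zeros is hypothetical and only for illustration. It assumes the values have already been collected to the driver as Decimal objects, which is how PySpark returns decimal columns.

```python
from decimal import Decimal


def display_without_trailing_zeros(value: Decimal) -> str:
    """Return a display string with trailing fractional zeros removed."""
    # normalize() drops trailing zeros; the "f" format keeps plain
    # notation, so 200.00 is shown as "200" rather than "2E+2".
    return format(value.normalize(), "f")


# Same values as Col_2 in the sample table.
values = [Decimal("2.00"), Decimal("2.50"), Decimal("0.00"), Decimal("1.28")]

# The numeric value does not depend on how many zeros are written.
same_value = Decimal("2.00") == Decimal("2")

# Only the string representation changes.
formatted = [display_without_trailing_zeros(v) for v in values]
print(same_value)
print(formatted)
```

The stored values stay untouched. Only the strings produced for display differ, which is also why 0.00 is shown as 0 here and not as an empty string.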
