Home > Enterprise >  Convert amounts in string format with commas and trailing minus mark to numeric or float
Convert amounts in string format with commas and trailing minus mark to numeric or float

Time:01-27

I have a column with numeric values in a string format with thousand separator commas and the minus marks stated at the end of the string for minus values.

df1 = pd.DataFrame({'Name':['John', 'Tom', 'Simon', 'Jose','Jose'],
                     'Amount':['1,000.00','1,000.00-', '100.00', '100.00-','1,000,000.00-'],
                     'ExpectedAmount':[1000.00,-1000.00, 100.00, -100.00,-1000000.00],
                    })

I want to convert the values in the "Amount" column to a float while preserving whether the value is a plus or minus.

I can remove the commas using "replace"

df1['Amount1'] = df1['Amount'].str.replace(',','')

But I'm unable to change the position of the minus mark from the end of the string to the beginning of the string so that I can change the format from string to float. Can someone help me to do this?

CodePudding user response:

You can swap minus sign with
df1.Amount.apply(lambda x: '-' x[:-1] if x[-1] == '-' else x)

CodePudding user response:

Here is a vectorial solution:

df1['ExpectedAmount'] = (df1['Amount'].str.replace('[^\d.]', '', regex=True).astype(float)
                        *df1['Amount'].str.endswith('-').mul(-2).add(1))

Remove all non digit/dot and convert to float

Multiply by -1 if the initial string ends with '-'

output:

    Name         Amount  ExpectedAmount
0   John       1,000.00          1000.0
1    Tom      1,000.00-         -1000.0
2  Simon         100.00           100.0
3   Jose        100.00-          -100.0
4   Jose  1,000,000.00-      -1000000.0

CodePudding user response:

Use Series.replace for swap - from back to start of string, remove , and casting to numeric:

df1['Amount'] = (df1['Amount'].replace({r'(.*)(-)$': r'\2\1', ',':''}, regex=True)
                              .astype(float))
print (df1)
    Name     Amount  ExpectedAmount
0   John     1000.0          1000.0
1    Tom    -1000.0         -1000.0
2  Simon      100.0           100.0
3   Jose     -100.0          -100.0
4   Jose -1000000.0      -1000000.0
  •  Tags:  
  • Related