Home > Back-end >  Change negative sign into parentheses pandas
Change negative sign into parentheses pandas

Time:02-15

I am attempting to change negative numbers in a DataFrame into an accounting format. The goal is to change numbers that look like this: $-1123.12 into numbers that look like this: ($1123.12).

The DataFrame has the form:

index  col A    col B   col C
0      $11.12  $7.34    66.6%
1      $-9.32  $-2.33   25.0%
2      $0.37   $0.21    56.0%
3      $32.10  $3.21    10.0%

I would like for my data frame to have the form:

index  col A    col B   col C
0      $11.12  $7.34    66.6%
1      ($9.32) ($2.33)  25.0%
2      $0.37   $0.21    56.0%
3      $32.10  $3.21    10.0%

My attempted solution went through each column of my DataFrame using a for loop to see if any columns contained the pattern: $-. If the columns contained said pattern, my goal was to replace said pattern with ($ and terminate the newly formed string with ).

The solution appears as follows:

for col in df.columns:
    if df[col].astype('string').str.contains(pat = "$-", regex = True):
       df[col] = df[col].astype('string').str.replace("$-", "($")   ")"
    else:
       df[col] = df[col]
    return df[col]

However, when I ran this solution I got the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

CodePudding user response:

You can match the pattern and reconstruct the output with backreference dataframe wise with following regex:

  • Match pattern: \$-([0-9.] ) captures the negative number
  • Replace pattern: ($\1) refers to captured number with \1 and wrap it in parenthesis.

df.replace(r'\$-([0-9.] )', r'($\1)', regex=True)

   index    col A    col B  col C
0      0   $11.12    $7.34  66.6%
1      1  ($9.32)  ($2.33)  25.0%
2      2    $0.37    $0.21  56.0%
3      3   $32.10    $3.21  10.0%

CodePudding user response:

Your solution doesn't work because

df[col].astype('string').str.contains(pat = "$-", regex = True)

is a boolean Series but if is expecting a truth-value.

You could write it as a boolean condition in np.where and select values instead:

updated = df.select_dtypes(exclude='int').apply(lambda col: np.where(col.str.contains(r'\$-'), '($'   col.str.strip('$-')   ')', col))
df.update(updated)

Output:

   index    col_A    col_B  col_C
0      0   $11.12    $7.34  66.6%
1      1  ($9.32)  ($2.33)  25.0%
2      2    $0.37    $0.21  56.0%
3      3   $32.10    $3.21  10.0%
  • Related