string.replace deletes whole string in certain cases

Time:02-02

I have a DataFrame in which I convert a float64 column to a string and then strip the trailing .0 from each value. This works for most values, but for the value 50.0 it deletes the entire string, leaving me with a null value. Any ideas what could cause this? Below are the two transformations I apply to the DataFrame:

Dataframe['Column'] = Dataframe['Column'].astype('string')
Dataframe['Column'] = Dataframe['Column'].str.replace('.0','')

Of the values I've checked, only some are affected. For example, rows with the values 50.0, 50.0, 49.0, 39.0 become , , 49, 39 after the transformation above.

CodePudding user response:

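In pandas versions before 2.0, str.replace treats the pattern as a regular expression by default (from pandas 2.0 onward the default is regex=False). As a regex, .0 means "any character followed by 0", so in 50.0 it matches both 50 and .0, and the whole string is deleted.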
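To make the matching behavior concrete, here is a small sketch using Python's standard re module rather than pandas itself, on the assumption that regex-mode str.replace follows the same matching rules. The sample values 100.0 and 10.05 are not from the question; they are added only to show further edge cases.

```python
import re

# Sample strings; "100.0" and "10.05" are illustrative additions.
values = ["50.0", "49.0", "39.0", "100.0", "10.05"]

# Unescaped '.' matches ANY character, so '.0' also matches the "50" in "50.0".
unescaped = [re.sub(".0", "", v) for v in values]

# Escaped dot, anchored at the end: removes only a trailing ".0".
anchored = [re.sub(r"\.0$", "", v) for v in values]

# Literal (non-regex) replacement: removes every ".0" substring, wherever it occurs.
literal = [v.replace(".0", "") for v in values]

print(unescaped)
print(anchored)
print(literal)
```

Note that the literal replacement turns 10.05 into 105, because .0 also occurs in the middle of that string. If the column can hold non-integer floats, the anchored pattern is the safer choice.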

You should either use a non-regex (literal) replacement:

import pandas as pd
Dataframe = pd.DataFrame({'Column': ['123.0', '50.0']})
Dataframe['Column'].str.replace('.0', '', regex=False)

or a correct regex:

Dataframe['Column'].str.replace(r'\.0$', '', regex=True)

output:

0    123
1     50
Name: Column, dtype: object
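As a rough end-to-end sketch, the anchored regex can be applied to the float values from the question. The name df is a stand-in for the asker's Dataframe, and the four values are the ones quoted in the question.

```python
import pandas as pd

# Float column holding the values mentioned in the question.
df = pd.DataFrame({"Column": [50.0, 50.0, 49.0, 39.0]})

# Convert to strings, then strip only a trailing ".0".
df["Column"] = df["Column"].astype("string")
df["Column"] = df["Column"].str.replace(r"\.0$", "", regex=True)

print(df["Column"].tolist())
```

Passing regex=True explicitly keeps the behavior the same regardless of which default the installed pandas version uses.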