Manipulate string values in pandas

I have a pandas DataFrame in which one column holds values in different formats, like this:

Name	Values
First	5-9
Second	7
Third	-
Fourth	12-16

I need to go through the whole Values column. If a value is a range, like 5-9 in the first row or 12-16 in the fourth row, it should be replaced with the mean of the two numbers, as a string: 5-9 becomes 7 and 12-16 becomes 14. If the value is just -, as in the third row, it should be replaced with 0.

I have tried

if df["Value"].str.len() > 1:
    df["Value"] = df["Value"].str.split('-')
    df["Value"] = (df["Value"][0] + df["Value"][1]) / 2
elif df["Value"].str.len() == 1:
    df["Value"] = df["Value"].str.replace('-', 0)

Expected output

Name	Values
First	7
Second	7
Third	0
Fourth	14

CodePudding user response：

Split the column on '-' with expand=True, cast the parts to float, and take the mean across each row (axis=1). Rows that contain no numbers come out as NaN and are filled with 0:

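# One column per part; a bare '-' yields two empty strings
s = df['Values'].str.split('-', expand=True)
# Mask empty strings to NaN, cast to float, then average across each row
df['Values'] = s[s != ''].astype(float).mean(axis=1).fillna(0)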

     Name  Values
0   First     7.0
1  Second     7.0
2   Third     0.0
3  Fourth    14.0
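
To try this end to end, here is a minimal self-contained sketch. It assumes the sample frame from the question, rebuilt by hand, and the intermediate names parts and nums are only illustrative:

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about the real frame).
df = pd.DataFrame({
    "Name": ["First", "Second", "Third", "Fourth"],
    "Values": ["5-9", "7", "-", "12-16"],
})

# Split on '-' into separate columns. A single number leaves the second
# column missing, and a bare '-' produces two empty strings.
parts = df["Values"].str.split("-", expand=True)

# Turn empty strings into NaN, convert the rest to float,
# then average across each row, skipping NaN.
nums = parts[parts != ""].astype(float)
df["Values"] = nums.mean(axis=1).fillna(0)

print(df)
```

The result is a float column. If every mean is known to be a whole number, as in the expected output, appending .astype(int) is one way to get integers.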

CodePudding user response：

You can use str.replace with a custom replacement function:

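# Convert a captured group to int, treating an empty match as 0
mint = lambda s: int(s or 0)
# Replace each "a-b" match with the mean of its two groups, as a string
repl = lambda m: str(sum(map(mint, map(m.group, [1, 2]))) / 2)
df['Values'] = df['Values'].str.replace(r'(\d*)-(\d*)', repl, regex=True)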

print(df)

     Name Values
0   First    7.0
1  Second      7
2   Third    0.0
3  Fourth   14.0
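
Note that this leaves a column of strings in mixed formats ('7.0' next to '7'). If a numeric column is preferred, one possible alternative is a plain Python helper mapped over the column. This is only a sketch: range_mean is a hypothetical name, the sample frame is rebuilt from the question, and it assumes the values never contain negative numbers, since '-' is treated purely as a separator:

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about the real frame).
df = pd.DataFrame({
    "Name": ["First", "Second", "Third", "Fourth"],
    "Values": ["5-9", "7", "-", "12-16"],
})

def range_mean(value: str) -> float:
    """Hypothetical helper: mean of the numbers in 'a-b', the number itself
    for 'a', and 0 when no number is present (for example '-')."""
    # Drop the empty pieces that split() produces for a bare '-'.
    numbers = [float(part) for part in value.split("-") if part]
    return sum(numbers) / len(numbers) if numbers else 0.0

# Apply the helper to every element of the column.
df["Values"] = df["Values"].map(range_mean)

print(df)
```

A per-element Python function is slower than the vectorized string methods on large frames, but it keeps the three cases (range, single number, bare dash) in one readable place.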