Got this DataFrame:
| Type | String | ext_id | int_id |
|---|---|---|---|
| 1 | UKidBC | 2393 | 2820 |
| 1 | UKidBC | 4816 | 1068 |
| 0 | UKidBC | 4166 | 3625 |
| 0 | UKidBC | 2803 | 1006 |
| 1 | UKidBC | 1189 | 2697 |
For each value in the String column, I need to replace the substring 'id' (as in UKidBC) according to the following rule:
If df['Type'] == 1, replace 'id' with the corresponding df['int_id'] value; otherwise replace it with the corresponding df['ext_id'] value.
I tried to use that line:
new_df.apply(lambda x: x['string'].replace(pat=['id'],
repl=x['int_id']) if x['Type'] == 1
else x['string'].replace(pat=['id'],repl=x['ext_id']),axis=1)
Keep getting this error:
str.replace() takes no keyword arguments
What am I doing wrong here?
CodePudding user response:
Instead of apply, we could use str.split + np.where to replace values according to the "Type" value:
tmp = df['String'].str.split('id', expand=True)
df['String'] = tmp[0] + np.where(df['Type'].astype(bool), df['int_id'].astype(str), df['ext_id'].astype(str)) + tmp[1]
Output:
Type String ext_id int_id
0 1 UK2820BC 2393 2820
1 1 UK1068BC 4816 1068
2 0 UK4166BC 4166 3625
3 0 UK2803BC 2803 1006
4 1 UK2697BC 1189 2697
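For anyone who wants to run this end to end, here is a minimal self-contained sketch of the split + np.where approach. The DataFrame is rebuilt from the table in the question, and the id columns are assumed to be integers. Wrapping the np.where result in a Series (the name `chosen` is just illustrative) is an addition of this sketch, not part of the answer above; it keeps the concatenation index-aligned.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (id columns assumed to be integers).
df = pd.DataFrame({
    "Type": [1, 1, 0, 0, 1],
    "String": ["UKidBC"] * 5,
    "ext_id": [2393, 4816, 4166, 2803, 1189],
    "int_id": [2820, 1068, 3625, 1006, 2697],
})

# Pick int_id where Type is 1, otherwise ext_id, converted to strings.
chosen = pd.Series(
    np.where(df["Type"].astype(bool),
             df["int_id"].astype(str),
             df["ext_id"].astype(str)),
    index=df.index,
)

# Split each string around 'id' and glue the chosen id in between.
tmp = df["String"].str.split("id", expand=True)
df["String"] = tmp[0] + chosen + tmp[1]

print(df)
```

This assumes every value in String contains 'id' exactly once; with more occurrences, the split would produce extra columns that this sketch ignores.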
CodePudding user response:
Assuming the position of 'id' in your string is fixed, use numpy.where and vectorized string concatenation (the ids are cast to str so they can be concatenated):
df['String'] = df['String'].str[:2] + np.where(df['Type'].eq(1), df['int_id'], df['ext_id']).astype(str) + df['String'].str[4:]
CodePudding user response:
You can use .str.extract and np.where:
df['String'] = df['String'].str.extract(r'(?P<g0>.+)id(?P<g2>.+)').assign(g1=np.where(df['Type'] == 1, df['int_id'], df['ext_id']).astype(str)).sort_index(axis=1).agg(list, axis=1).str.join('')
Output:
>>> df
Type String ext_id int_id
0 1 UK2820BC 2393 2820
1 1 UK1068BC 4816 1068
2 0 UK4166BC 4166 3625
3 0 UK2803BC 2803 1006
4 1 UK2697BC 1189 2697
CodePudding user response:
You can keep your original idea (apply() with replace()); just pass the arguments to replace() positionally and convert the id to a string:
new_df["String"] = new_df.apply(
    # str.replace needs a str replacement, so cast the chosen id
    lambda row: row["String"].replace("id", str(row["int_id"] if row["Type"] == 1 else row["ext_id"])),
    axis=1
)
Output:
Type String ext_id int_id
0 1 UK2820BC 2393 2820
1 1 UK1068BC 4816 1068
2 0 UK4166BC 4166 3625
3 0 UK2803BC 2803 1006
4 1 UK2697BC 1189 2697
CodePudding user response:
This question honestly looks like one of those coding challenges you see.
Assuming that your dataframe variable is new_df:
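for idx, row in new_df.iterrows():
    # iterating over the DataFrame directly would yield column names, so use iterrows()
    new_id = row["int_id"] if row["Type"] == 1 else row["ext_id"]
    new_df.at[idx, "String"] = row["String"].replace("id", str(new_id))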
What you did wrong is (as the error says) you gave keyword arguments to str.replace, which does not take kwargs. Instead, the first argument is the pattern to replace, and the second is what to replace it with.
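To make the difference concrete, here is a small sketch (variable names are illustrative) contrasting Python's built-in str.replace with the pandas Series.str.replace accessor. The pat and repl keywords from the question belong to the pandas accessor; inside apply with axis=1, each x['String'] is a plain Python string, so the built-in method is what gets called.

```python
import pandas as pd

s = "UKidBC"

# Built-in str.replace: old and new are positional, and new must be a str.
fixed = s.replace("id", str(2820))

# Passing pat/repl as keywords to the built-in method raises a TypeError,
# which is the failure described in the question.
error_message = ""
try:
    s.replace(pat="id", repl="2820")
except TypeError as exc:
    error_message = str(exc)

# The pandas accessor, by contrast, does accept pat and repl as keywords.
series_fixed = pd.Series([s]).str.replace(pat="id", repl="2820", regex=False)

print(fixed)
print(error_message)
print(series_fixed.iloc[0])
```

Note that the replacement value is cast with str(): passing an integer id straight to the built-in replace would raise a TypeError of its own.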
CodePudding user response:
A list comprehension combined with np.where can be a fast option:
strings = np.where(df['Type'].eq(1),df['int_id'],df['ext_id']).astype(str)
df['String'] = [a.replace("id",b) for a,b in zip(df['String'],strings)]
print(df)
Type String ext_id int_id
0 1 UK2820BC 2393 2820
1 1 UK1068BC 4816 1068
2 0 UK4166BC 4166 3625
3 0 UK2803BC 2803 1006
4 1 UK2697BC 1189 2697
