I don't really understand why this works in one form and doesn't work in another.
Works:
df['team'] = df['team'].str.extract(r'(\w+) ')
Doesn't work:
def clear_teams(gr):
    return gr.str.extract(r'(\w+) ')
df['team'] = df['team'].apply(clear_teams)
I receive an error:
AttributeError: 'str' object has no attribute 'str'
Why doesn't it work? Can someone explain it to me, please? :) How can it have the str attribute in one case and not in the other?
CodePudding user response:
If you use Series.apply, then gr inside the function is a scalar, because apply calls the function once per element of the Series. So you cannot use Series methods such as str.extract on it.
def clear_teams(gr):
    print(type(gr))
    print(gr)
    return gr.str.extract(r'(\w+) ')
If you use Series.pipe, then gr is the whole Series, so everything works correctly:
df['team'] = df['team'].pipe(clear_teams)
Or:
df['team'] = clear_teams(df['team'])
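
To see the difference between the two calls directly, here is a minimal sketch that records what kind of object the function receives. The sample data and the helper name record_type are made up for illustration; the question does not show its DataFrame.

```python
import pandas as pd

# Hypothetical sample data; the original question does not show its DataFrame.
df = pd.DataFrame({"team": ["Lakers LA", "Bulls CHI"]})

seen_types = []

def record_type(gr):
    # Remember what kind of object the function was handed.
    seen_types.append(type(gr).__name__)
    return gr

df["team"].apply(record_type)  # the function is called per element
apply_types = list(seen_types)

seen_types.clear()
df["team"].pipe(record_type)   # the function is called once with the Series
pipe_types = list(seen_types)

print(apply_types)
print(pipe_types)
```

With apply every recorded type is str, while with pipe the single recorded type is Series, which is why the .str accessor is only available in the second case.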
CodePudding user response:
When you use apply on a pd.Series, the function you pass receives the actual cell value, one call per element:
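import re

def clear_teams(gr):
    # gr is a plain Python str here; the .str accessor (and its extract
    # method) belongs to pd.Series, so use the re module on the scalar instead
    match = re.search(r'(\w+) ', gr)
    return match.group(1) if match else None
df['team'] = df['team'].apply(clear_teams)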
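
As a rough comparison of the per-element and the vectorized approach, here is a small sketch. It assumes the pattern in the question was meant to be (\w+) followed by a space; the sample values and the helper name first_word are hypothetical. Passing expand=False makes str.extract return a Series rather than a one-column DataFrame.

```python
import re
import pandas as pd

# Hypothetical sample values; the last one has no trailing space, so no match.
teams = pd.Series(["Lakers LA", "Bulls CHI", "Heat"])

def first_word(value):
    # value is a single Python str, so the re module is used directly.
    match = re.search(r"(\w+) ", value)
    return match.group(1) if match else None

# Per-element version: apply calls first_word on each string.
per_element = teams.apply(first_word)

# Vectorized version: the .str accessor works on the whole Series at once.
vectorized = teams.str.extract(r"(\w+) ", expand=False)

print(per_element.tolist())
print(vectorized.tolist())
```

Both versions give the same extracted words for the rows that match and a missing value for the row that does not. The vectorized form is usually the more idiomatic choice in pandas, and the apply form is mainly useful when the per-element logic cannot be expressed with the .str methods.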
