As shown below, I want to remove duplicate names, keeping the first occurrence of each name but the last value of team.
How can I accomplish this with .drop_duplicates() or otherwise?
name team ...
0 john a ...
1 mike b ...
2 john c
↓
name team ...
0 john c ...
1 mike b ...
-- Additional note about comments --
.groupby('name').agg({'team': 'last', 'country': 'first'})
With the code above, if the first row of country for a name is NaN, a later non-NaN value is returned instead, as shown below.
Is this because NaN values are ignored?
Even when 'first' is specified and the first value is NaN, NaN must still be retained.
name team country ...
0 john a Nan ...
1 mike b Brazil ...
2 john c Canada ...
↓
name team country ...
0 john c Canada ...
1 mike b Brazil ...
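For reference, a minimal sketch that reproduces the result above. The frame is rebuilt by hand from the example rows, so the construction itself is an assumption; the elided columns are left out.

```python
import numpy as np
import pandas as pd

# Example data copied from the tables above (elided columns omitted).
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
    "country": [np.nan, "Brazil", "Canada"],
})

# The 'first' aggregation skips missing values by default, so john's
# country comes from the first non-NaN row rather than from row 0.
result = df.groupby("name").agg({"team": "last", "country": "first"})
print(result)
```

This suggests the NaN is indeed being skipped: the string aggregations 'first' and 'last' return the first or last non-null entry in each group.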
CodePudding user response:
You can use the .groupby() function:
df.groupby('name').agg({'team': 'last'})
Be aware that the value returned for each name depends on the row order of your DataFrame.
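If the NaN from the question's note has to be kept, one possible approach is to aggregate by position instead of using the 'first' and 'last' strings, which skip missing values. The sketch below assumes the example data from the question; the helper names first_row and last_row are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

# Example data from the question (elided columns omitted).
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
    "country": [np.nan, "Brazil", "Canada"],
})

def first_row(s):
    # Positional first value of the group, NaN included.
    return s.iloc[0]

def last_row(s):
    # Positional last value of the group, NaN included.
    return s.iloc[-1]

out = (
    df.groupby("name", sort=False)  # sort=False keeps names in order of first appearance
      .agg(team=("team", last_row), country=("country", first_row))
      .reset_index()
)
print(out)
```

Here john keeps team c from his last row and NaN from his first row. Because the selection is positional, sort the DataFrame into the order you want before grouping.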
