I have the following dataframe and would like to split the name column at the last underscore "_" and assign the last 4 characters to a "date" column, but I get an indexing error. How do I accomplish this?
name val
NETUSE_2014 1
NETUSE_2015 1
NETUSE_2016 1
NETUSE_2017 1
NET_ALL_2013 1
NET_ALL_2014 1
NET_ALL_2015 1
NET_ALL_2016 1
df['Year'] = df['name'].str[-4:]
I get this error:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
wet2['date'] = wet2['name'].str[-4:]
I would like the following dataframe:
name val date
NETUSE 1 2014
NETUSE 1 2015
NETUSE 1 2016
NETUSE 1 2017
NET_ALL 1 2013
NET_ALL 1 2014
NET_ALL 1 2015
NET_ALL 1 2016
CodePudding user response:
Your code:
df['Year'] = df['name'].str[-4:]
works as written on a standalone DataFrame, but we don't know how you obtained df. The message is pandas' SettingWithCopyWarning, and it says you're trying to modify a copy of a slice of a DataFrame. So my guess is that df was sliced from another, bigger DataFrame without being copied.
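To illustrate that guess, here is a minimal sketch. The parent frame `big` and the filter on `val` are hypothetical, since the question doesn't show how wet2 was created. Taking an explicit `.copy()` of the filtered frame makes it independent, so adding a column to it should no longer raise the warning:

```python
import pandas as pd

# Hypothetical parent frame; the question does not show how wet2 was built.
big = pd.DataFrame({
    "name": ["NETUSE_2014", "NETUSE_2015", "NET_ALL_2013", "OTHER"],
    "val": [1, 1, 1, 0],
})

# Assumed filter step. The explicit .copy() makes wet2 an independent
# DataFrame instead of a slice that still refers back to big.
wet2 = big[big["val"] == 1].copy()

# Adding a column now modifies only wet2.
wet2["date"] = wet2["name"].str[-4:]
```

The parent frame is left untouched; only wet2 gains the date column.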
You could also use str.rsplit with n=1. That way, you only split once, starting from the right. Pass n as a keyword, because recent pandas versions no longer accept it positionally:
df[['name','date']] = df['name'].str.rsplit('_', n=1, expand=True)
Output:
name val date
0 NETUSE 1 2014
1 NETUSE 1 2015
2 NETUSE 1 2016
3 NETUSE 1 2017
4 NET_ALL 1 2013
5 NET_ALL 1 2014
6 NET_ALL 1 2015
7 NET_ALL 1 2016
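For reference, here is a self-contained version of the rsplit approach, rebuilt from the sample data in the question. The intermediate name `parts` is only for illustration:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": [
        "NETUSE_2014", "NETUSE_2015", "NETUSE_2016", "NETUSE_2017",
        "NET_ALL_2013", "NET_ALL_2014", "NET_ALL_2015", "NET_ALL_2016",
    ],
    "val": [1] * 8,
})

# Split once from the right, so "NET_ALL_2013" becomes ["NET_ALL", "2013"].
# expand=True returns the pieces as a two-column DataFrame.
parts = df["name"].str.rsplit("_", n=1, expand=True)

df["name"] = parts[0]
df["date"] = parts[1]
```

Note that the date values are still strings here; convert them with astype(int) if you need numbers.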
CodePudding user response:
Simply do this (it works):
df['date'] = df['name'].str[-4:]
df['name'] = df['name'].str[:-5]
Output:
name val date
0 NETUSE 1 2014
1 NETUSE 1 2015
2 NETUSE 1 2016
3 NETUSE 1 2017
4 NET_ALL 1 2013
5 NET_ALL 1 2014
6 NET_ALL 1 2015
7 NET_ALL 1 2016
A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
This warning occurs when you assign a value to a DataFrame that was produced by filtering or slicing another one. With the code above, it doesn't occur as long as df is a standalone DataFrame.
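If you actually want the new column on the original frame rather than on the filtered one, the warning's own suggestion of .loc can be sketched like this. The frame `big` and the mask are hypothetical, since the question doesn't show the filtering step:

```python
import pandas as pd

# Hypothetical parent frame and filter; not taken from the question.
big = pd.DataFrame({
    "name": ["NETUSE_2014", "NETUSE_2015", "NET_ALL_2013", "OTHER"],
    "val": [1, 1, 1, 0],
})
mask = big["val"] == 1

# Assign through .loc on the parent frame itself, in a single step.
# Rows outside the mask get NaN in the new column.
big.loc[mask, "date"] = big.loc[mask, "name"].str[-4:]
```

This writes to big directly, so there is no intermediate slice for pandas to warn about.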
