I have the following pandas dataframe:
import pandas as pd
df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio'],
                   'GEO': ['POINT (48.85 2.31647)',
                           'POINT (40.731499671618 -73.993457389558)',
                           'POINT (-22.9 -43.2)']})
print(df)
       NAME                                       GEO
0     Paris                     POINT (48.85 2.31647)
1  New York  POINT (40.731499671618 -73.993457389558)
2       Rio                       POINT (-22.9 -43.2)
I need to split the GEO column into two columns: one for latitude and one for longitude.
Based on the code in "Adding Lat Lon coordinates to separate columns (python/dataframe)", I implemented the following:
df['GEO'].str('POINT ()').str.strip(' ', expand=True).rename(columns={0:'LAT', 1:'LONG'})
However, it is giving the error: "TypeError: 'StringMethods' object is not callable"
I would like the output to be:
    NAME                                       GEO              LAT              LONG
   Paris                     POINT (48.85 2.31647)            48.85           2.31647
New York  POINT (40.731499671618 -73.993457389558)  40.731499671618  -73.993457389558
     Rio                       POINT (-22.9 -43.2)            -22.9             -43.2
CodePudding user response:
You could use a regex:
df2 = df.join(df['GEO'].str.extract(r'(?P<LAT>-?\d+\.\d+) (?P<LONG>-?\d+\.\d+)'))
output:
       NAME                                       GEO              LAT              LONG
0     Paris                     POINT (48.85 2.31647)            48.85           2.31647
1  New York  POINT (40.731499671618 -73.993457389558)  40.731499671618  -73.993457389558
2       Rio                       POINT (-22.9 -43.2)            -22.9             -43.2
or, to get floats:
df2 = df.join(df['GEO'].str.extract(r'(?P<LAT>-?\d+\.\d+) (?P<LONG>-?\d+\.\d+)')
                       .astype(float))
output:
       NAME                                       GEO      LAT       LONG
0     Paris                     POINT (48.85 2.31647)  48.8500   2.316470
1  New York  POINT (40.731499671618 -73.993457389558)  40.7315 -73.993457
2       Rio                       POINT (-22.9 -43.2) -22.9000 -43.200000
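A slightly stricter variant of the regex approach, as a minimal sketch. The pattern below is an assumption, not part of the answer above: it anchors on the literal "POINT (...)" wrapper and makes the fractional part optional, so a value such as "POINT (48 2)" would also match. The names pattern and coords are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio'],
                   'GEO': ['POINT (48.85 2.31647)',
                           'POINT (40.731499671618 -73.993457389558)',
                           'POINT (-22.9 -43.2)']})

# Assumed pattern: literal "POINT (" ... ")" with two signed numbers inside;
# the decimal part of each number is optional.
pattern = r'POINT \((?P<LAT>-?\d+(?:\.\d+)?) (?P<LONG>-?\d+(?:\.\d+)?)\)'

# Named groups become the column names of the extracted frame.
coords = df['GEO'].str.extract(pattern).astype(float)

# join aligns on the index, so the new columns line up with the original rows.
df2 = df.join(coords)
print(df2)
```

Rows that do not match the pattern come back as NaN in both columns, which makes malformed entries easy to spot with df2['LAT'].isna().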
CodePudding user response:
You were very close, but the .str accessor is not callable: you cannot invoke it as .str(). Modify your code as follows and it works (though it is not quite as sleek as your one-liner):
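df[['POINT', 'LAT', 'LONG']] = df['GEO'].str.split(' ', expand=True).rename(columns={0: 'POINT', 1: 'LAT', 2: 'LONG'})
# regex=False treats '(' and ')' as literal characters rather than regex syntax
df['LAT'] = df['LAT'].str.replace('(', '', regex=False)
df['LONG'] = df['LONG'].str.replace(')', '', regex=False)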
You can then delete the helper column, for example with df = df.drop(columns='POINT').
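For comparison, here is a sketch of what the original one-liner was probably aiming for, assuming the intent was to strip the "POINT (" and ")" wrapper and then split on the space. Note that str.strip treats its argument as a set of characters to remove from both ends, not as a literal prefix or suffix, which happens to work here because coordinates never start or end with those characters. The names coords and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'NAME': ['Paris', 'New York', 'Rio'],
                   'GEO': ['POINT (48.85 2.31647)',
                           'POINT (40.731499671618 -73.993457389558)',
                           'POINT (-22.9 -43.2)']})

coords = (
    df['GEO']
    .str.strip('POINT ()')         # drop P, O, I, N, T, space, ( and ) from both ends
    .str.split(' ', expand=True)   # "lat lon" -> two columns labelled 0 and 1
    .rename(columns={0: 'LAT', 1: 'LONG'})
    .astype(float)                 # convert the string pieces to floats
)

result = df.join(coords)
print(result)
```

This keeps the GEO column and avoids the temporary POINT column altogether.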
