I have a pandas DataFrame with a column called "Location". Example contents: "London Arndale Centre", "Manchester Arndale", "Birmingham Central Station", "Newcastle Metro Centre".
I also have 2 numpy arrays:
originalLocation = np.array(["London Arndale Centre","Manchester Arndale","Birmingham Central Station","Newcastle Metro Centre")
newLocation = np.array(["London","Manchester","Birmingham","Newcastle"]
I want to create a new column in the DataFrame: newLocation.
For each row, the result needs to be the entry in newLocation at the position where the Location field matches an entry in the originalLocation array.
Example: "London Arndale Centre" needs to be "London", and "Manchester Arndale" needs to be "Manchester".
I have tried this, but it throws an error:
df['newLocation'] = newLocation[int(np.where(originalLocation == df['Location'])[0])]
Error: ValueError: ('Lengths must match to compare', (159,), (12,))
What am I doing wrong here?
CodePudding user response:
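It seems like your array definitions are missing their closing brackets (the `]` in originalLocation and the `)` in newLocation). Also, the int() is not necessary. Note that originalLocation == df['Location'] compares the two element by element, so it only works when the DataFrame has the same length and row order as originalLocation; comparing 159 rows against 12 entries is what raises the ValueError. Updated code for the case where they line up: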
import numpy as np
import pandas as pd
df_data = ["London Arndale Centre", "Manchester Arndale", "Birmingham Central Station", "Newcastle Metro Centre"]
df = pd.DataFrame(df_data, columns=['Location'])
originalLocation = np.array(["London Arndale Centre", "Manchester Arndale", "Birmingham Central Station", "Newcastle Metro Centre"])
newLocation = np.array(["London","Manchester","Birmingham","Newcastle"])
df['newLocation'] = newLocation[np.where(originalLocation == df['Location'])[0]]
df
Output:
                     Location newLocation
0       London Arndale Centre      London
1          Manchester Arndale  Manchester
2  Birmingham Central Station  Birmingham
3      Newcastle Metro Centre   Newcastle
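If the DataFrame has more rows than the lookup arrays (as the 159 vs 12 in the error suggests), an element-wise comparison will not line up. One possible alternative, not taken from the answer above, is to build a dictionary from the two arrays and pass it to `Series.map`. The sketch below uses a made-up DataFrame with repeated, reordered rows and one unmatched value ("Leeds Trinity" is purely illustrative); `lookup` is a name introduced here.

```python
import numpy as np
import pandas as pd

originalLocation = np.array(["London Arndale Centre", "Manchester Arndale",
                             "Birmingham Central Station", "Newcastle Metro Centre"])
newLocation = np.array(["London", "Manchester", "Birmingham", "Newcastle"])

# Hypothetical data: more rows than the lookup arrays, in a different order,
# plus one location that has no entry in originalLocation.
df = pd.DataFrame({"Location": ["Manchester Arndale", "London Arndale Centre",
                                "Manchester Arndale", "Newcastle Metro Centre",
                                "Leeds Trinity"]})

# Pair each full name with its short name, then look every row up in that dict.
lookup = dict(zip(originalLocation.tolist(), newLocation.tolist()))
df["newLocation"] = df["Location"].map(lookup)

print(df)
```

Rows whose Location is not in originalLocation end up as NaN in newLocation, so you can spot them with `df["newLocation"].isna()` or replace them with `fillna` if a default makes sense.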
