I have a column in a DataFrame that I want to split into two columns on a comma delimiter. If a value in that column does not contain a comma, I want it to go into the second column instead of the first.
| Origin |
|---|
| New York, USA |
| England |
| Russia |
| London, England |
| California, USA |
| USA |
I want the result to be:
| Location | Country |
|---|---|
| New York | USA |
| NaN | England |
| NaN | Russia |
| London | England |
| California | USA |
| NaN | USA |
I tried this code:
df['Location'], df['Country'] = df['Origin'].str.split(',', 1)
CodePudding user response:
We can try using str.extract here:
df["Location"] = df["Origin"].str.extract(r'(.*),')
df["Country"] = df["Origin"].str.extract(r'(\w+(?: \w+)*)$')
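For reference, here is a self-contained sketch of the str.extract idea, assuming the sample data from the question. The `expand=False` argument is my addition so that each call returns a Series rather than a one-column DataFrame, and the country pattern assumes country names are made of word characters separated by single spaces.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Origin": [
        "New York, USA", "England", "Russia",
        "London, England", "California, USA", "USA",
    ]
})

# Everything before the comma; rows without a comma do not match and become NaN.
df["Location"] = df["Origin"].str.extract(r"(.*),", expand=False)

# One or more space-separated words anchored at the end of the string.
df["Country"] = df["Origin"].str.extract(r"(\w+(?: \w+)*)$", expand=False)

print(df)
```

Rows without a comma get NaN in Location because the first pattern requires a literal comma to match.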
CodePudding user response:
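Another option is to use np.where: split on the comma, then choose items depending on the length of each resulting list. Splitting on ', ' (comma plus space) keeps a leading space out of the second part.
import numpy as np
s = df['Origin'].str.split(', ')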
msk = s.str.len()>1
df["Location"] = np.where(msk, s.str[0], np.nan)
df["Country"] = np.where(msk, s.str[1], s.str[0])
Output:
| | Origin | Location | Country |
|---|---|---|---|
| 0 | New York, USA | New York | USA |
| 1 | England | NaN | England |
| 2 | Russia | NaN | Russia |
| 3 | London, England | London | England |
| 4 | California, USA | California | USA |
| 5 | USA | NaN | USA |
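A runnable version of the np.where approach might look like the sketch below, again assuming the sample data from the question. The names `parts`, `has_comma`, `first` and `second` are only illustrative, and the `.str.strip()` calls are my addition so that the result does not depend on whether a space follows the comma.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Origin": [
        "New York, USA", "England", "Russia",
        "London, England", "California, USA", "USA",
    ]
})

# Split each value into a list of pieces on the comma.
parts = df["Origin"].str.split(",")

# True where the value contained a comma (the list has more than one piece).
has_comma = parts.str.len() > 1

# Trim surrounding whitespace; second is NaN where there was no comma.
first = parts.str[0].str.strip()
second = parts.str[1].str.strip()

# With a comma: first piece is the location, second is the country.
# Without a comma: location is NaN and the only piece is the country.
df["Location"] = np.where(has_comma, first, np.nan)
df["Country"] = np.where(has_comma, second, first)

print(df)
```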
