I have a column 'Id' that has data like this:
'10020-100-700-800-2'
How can I create a new column that would only contain the third number, in this case 700, for each row?
Here is an example dataframe:
d = {'id': {0: '10023_11_762553_762552_11', 1: '10023_14_325341_359865_14', 2: '10023_17_771459_771453_17', 3: '10023_20_440709_359899_20', 4: '10023_24_773107_625033_24', 5: '10023_27_771462_771463_27', 6: '10023_30_771262_771465_30', 7: '10023_33_761971_762470_33'}, 'values': {0: 10023, 1: 10023, 2: 10023, 3: 10023, 4: 10023, 5: 10023, 6: 10023, 7: 10023}}
CodePudding user response:
Use str.split and take the third argument of the list:
df = pd.DataFrame({'Col': ['10020-100-700-800-2']})
df['NewCol'] = df['Col'].str.split('-').str[2].astype(int)
print(df)
# Output
Col NewCol
0 10020-100-700-800-2 700
Update with your sample:
data = {'Id': ['10020-100-700-800-2',
'10022-400-900-900-2',
'10045-600-800-900-3',
'10000-300-400-300-3',
'10020-200-200-200-2'],
'Employed': [1, 0, 0, 1, 1],
'Name': ['Alan', 'Joe', 'Sam', 'Amy', 'Chloe']}
df = pd.DataFrame(data)
df['Id2'] = df['Id'].str.split('-').str[2].astype(int)
print(df)
# Output
Id Employed Name Id2
0 10020-100-700-800-2 1 Alan 700
1 10022-400-900-900-2 0 Joe 900
2 10045-600-800-900-3 0 Sam 800
3 10000-300-400-300-3 1 Amy 400
4 10020-200-200-200-2 1 Chloe 200
Update 2 with your new data
data = {'id': ['10023_11_762553_762552_11',
'10023_14_325341_359865_14',
'10023_17_771459_771453_17',
'10023_20_440709_359899_20',
'10023_24_773107_625033_24',
'10023_27_771462_771463_27',
'10023_30_771262_771465_30',
'10023_33_761971_762470_33'],
'values': [10023, 10023, 10023, 10023, 10023, 10023, 10023, 10023]}
df = pd.DataFrame(data)
df['id2'] = df['id'].str.split('_').str[2].astype(int)
print(df)
# Output
id values id2
0 10023_11_762553_762552_11 10023 762553
1 10023_14_325341_359865_14 10023 325341
2 10023_17_771459_771453_17 10023 771459
3 10023_20_440709_359899_20 10023 440709
4 10023_24_773107_625033_24 10023 773107
5 10023_27_771462_771463_27 10023 771462
6 10023_30_771262_771465_30 10023 771262
7 10023_33_761971_762470_33 10023 761971
