I have a dataset, df, where I would like to form a new column by taking first letter of column values and mapping values from additional columns within the dataset.
Data
id Date Model type status
aa Q1 2022 orion hi01 o_hi
aa Q1 2022 orion hi02 o_hi
aa Q1 2022 orion hello01 o_hello
aa Q1 2022 orion go01 o_go
bb Q1 2022 orion hi01 o_hi
bb Q1 2022 orion hi02 o_hi
bb Q1 2022 orion hi03 o_hi
Desired
id Date Model type status NEW
aa Q1 2022 orion hi01 o_hi aa o_hi01 Q1 2022
aa Q1 2022 orion hi02 o_hi aa o_hi02 Q1 2022
aa Q1 2022 orion hello01 o_hello aa o_hello01 Q1 2022
aa Q1 2022 orion go01 o_go aa o_go01 Q1 2022
bb Q1 2022 orion hi01 o_hi bb o_hi01 Q1 2022
bb Q1 2022 vorion hi02 v_hi bb v_hi02 Q1 2022
bb Q1 2022 vorion hi03 v_hi bb v_hi03 Q1 2022
Doing
df['NEW'] = df['id'].map(str) ' ' df['Model'].astype(str).str[0] df['type'].map(str) ' ' df['Date'].map(str)
However, I am troubleshooting on how to incorporate the first letter of column values within the newly formed column.
Believe I can incorporate .loc
df['Model'].astype(str).str[0]
Any suggestion is appreciated.
CodePudding user response:
You don't need to use .map(str) because all of your fields are already strings. If needed use df[column].astype(str) instead of df[column].map(str).
df['New'] = df['id'] ' ' df['Model'].str[0] '_' df['type'] ' ' df['Date']
print(df)
# Output
id Date Model type status New
0 aa Q1 2022 orion hi01 o_hi aa o_hi01 Q1 2022
1 aa Q1 2022 orion hi02 o_hi aa o_hi02 Q1 2022
2 aa Q1 2022 orion hello01 o_hello aa o_hello01 Q1 2022
3 aa Q1 2022 orion go01 o_go aa o_go01 Q1 2022
4 bb Q1 2022 orion hi01 o_hi bb o_hi01 Q1 2022
5 bb Q1 2022 vorion hi02 o_hi bb v_hi02 Q1 2022
6 bb Q1 2022 vorion hi03 o_hi bb v_hi03 Q1 2022
CodePudding user response:
Try this:
df['NEW'] = df['id'].map(str) ' ' df['status'].astype(str) df.groupby(['id', 'status']).cumcount().add(1).astype(str).str.zfill(2) ' ' df['Date']
Output:
>>> df
id Date type status NEW
0 aa Q1 2022 hi01 o_hi aa o_hi01 Q1 2022
1 aa Q1 2022 hi02 o_hi aa o_hi02 Q1 2022
2 aa Q1 2022 hello01 o_hello aa o_hello01 Q1 2022
3 aa Q1 2022 go01 o_go aa o_go01 Q1 2022
4 bb Q1 2022 hi01 o_hi bb o_hi01 Q1 2022
5 bb Q1 2022 hi02 o_hi bb o_hi02 Q1 2022
6 bb Q1 2022 hi03 o_hi bb o_hi03 Q1 2022
CodePudding user response:
df['NEW'] = df['id'].map(str) ' ' df['Model'].astype(str).str[0] '_' df['type'].map(str) ' ' df['Date'].map(str)
