I have the following data frame.
profile_id companyName title
0 ACoAABTDOXgBMlsNG44iUdoDo0lCc1HcheRnMfg Project A Ventures CMO
1 ACoAABTDOXgBMlsNG44iUdoDo0lCc1HcheRnMfg Project A Ventures Head of Customer Acquisition
2 ACoAABTDOXgBMlsNG44iUdoDo0lCc1HcheRnMfg Facebook Client Solutions Manager (Disruptors)
3 ACoAABTDOXgBMlsNG44iUdoDo0lCc1HcheRnMfg Project A Ventures Senior Display Marketing Manager
4 ACoAABTDOXgBMlsNG44iUdoDo0lCc1HcheRnMfg Project A Ventures Display Marketing Manager
My goal is to convert every row to a single column like this. Basically, I have 5 companies and 5 titles for one profile id and I want to have a single row per profile id.

Edit: The number of companies/titles can differ per profile.
I've tried pivoting but it doesn't seem to work for this case.
Any help is appreciated.
CodePudding user response:
One solution could be as follows:
Setup
import pandas as pd
data = {'profile_id': [1, 1, 1, 2, 2],
'companyName': [*'ABCAB'],
'title': [*'XYZYZ']}
df = pd.DataFrame(data)
print(df)
profile_id companyName title
0 1 A X
1 1 B Y
2 1 C Z
3 2 A Y
4 2 B Z
Pivot
- Use
df.pivot_tablewithaggfunc=listto get eachprofile_idin a single row with the values forcompanyNameandtitleas lists. - Use the constructor
df.DataFrameon both columns (with.tolist()). This will expand the single columns into multiple columns with the individual items from the lists. Wrap this insidepd.concatto get onedfagain. - At this stage, we'll have
res.columnsasInt64Index([0, 1, 2, 0, 1, 2], dtype='int64'). I.e.0,1,2cols forcompanyName, and same fortitle. We are looking for0,0,1,1,2,2, so let's applyres = res.loc[:, res.columns.unique()]. - Finally, use
itertools.cycleto rename all the columns.
tmp = df.pivot_table(index='profile_id',values=['companyName','title'],aggfunc=list)
res = pd.concat([pd.DataFrame(tmp[col].tolist(),
index=tmp.index)
for col in tmp.columns], axis=1)
res = res.loc[:, res.columns.unique()]
from itertools import cycle
res.columns = [f'{j}{int(i) 1}' for i, j in zip(res.columns, cycle(tmp.columns))]
print(res)
companyName1 title1 companyName2 title2 companyName3 title3
profile_id
1 A X B Y C Z
2 A Y B Z None None
