I have one Dataframe1 from which I need to form new Dataframe2 as given below. Column n1 will pick the value from Status column in Dataframe1 if Dataframe1.name = A and similarly column n2 will take the value from column Status if Dataframe1.Name = B. Also, Timestamp and id will have unique values. Can anybody please help?
Input Dataframe1:
| id | Timestamp | Name | Status |
|---|---|---|---|
| 1 | 02:15:00 | A | FALSE |
| 1 | 02:15:00 | B | TRUE |
| 2 | 03:00:00 | A | TRUE |
| 2 | 03:00:00 | B | FALSE |
Output Dataframe2:
| id | Timestamp | n1 | n2 |
|---|---|---|---|
| 1 | 02:15:00 | FALSE | TRUE |
| 2 | 03:00:00 | TRUE | FALSE |
CodePudding user response:
What you are trying to do is taking a pivot of the data with special names. If you rename A and b values as you want to n1 and n2 only thing you have to do is to use the pandas.pivot_table function.
because as its aggregation function it uses mean strings don't work out of the box. You have to provide your own aggregation function. Because in our situation every row is unique we can just give the aggregation function to take the value of that row.
dataframe1['Name'] = dataframe1['Name'].replace({'A': 'n1', 'b': 'n2'})
dataframe1.pivot_table(index=['id', 'Timestamp'],
columns='Name',
values='Status',
aggfunc=lambda x:x).reset_index()
CodePudding user response:
You can use pandas.pivot_table:
df2 = df.pivot_table(index=['id','Timestamp'], columns='Name', values='Status').reset_index().set_index('id')
df2.columns = ['Timestamp','n1','n2']
Output:
>>> df2
Timestamp n1 n2
id
1 02:15:00 FALSE TRUE
2 03:00:00 TRUE FALSE
CodePudding user response:
using pivot_table and then adjusting the result header.
import pandas as pd
df = pd.read_excel('test.xls', index_col = False)
df2 = df.pivot_table(index = ['id', 'Timestamp'], columns = 'Name', values = 'Status').reset_index().rename_axis(None, axis=1).rename(columns = {'A': 'n1', 'B': 'n2'})
print(df2)
