I have a dataframe like below.
|items(index)|event|day|
|---- |------|-----|
|1000|[buy, buy]|[2019-03-26, 2019-04-26]|
|1001|[buy, sell]|[2019-05-26, 2019-06-26]|
|1002|[buy, sell]|[2019-04-21, 2019-04-29]|
|1003|[sell, buy]|[2019-03-21, 2019-04-23]|
The order of values in the list of event columns corresponds to the order of values in the list of day columns; thus, 1000's first buy action occurred 2019-03-26.
The problem is that I want the Datetime to be the index and the current index to be the column, so it looks like the following.
|day(index)|1000|1001|1002|1003|
|----|------|----- |-----|-----|
|2019-03-21| | | |sell |
|2019-03-26|buy| | | |
|2019-04-21| | |buy | |
|2019-04-23| | | | |
|2019-04-26|buy| | | |
|2019-04-29| | |sell | |
|2019-05-26| |buy| | |
|2019-06-26| |sell| | |
I don't know how I can effectively convert the above dataframe.
CodePudding user response:
Assuming your df to be:
In [354]: df = pd.DataFrame({'items':[1000, 1001, 1002, 1003], 'event':[['buy', 'buy'], ['buy', 'sell'], ['buy', 'sell'], ['sell', 'buy']], 'day':[['2019-03-26', '2019-04-26'], ['2019-05-26', '2019-06-26'], ['2019-04-21',
...: '2019-04-29'], ['2019-03-21', '2019-04-23']]})
In [355]: df
Out[355]:
items event day
0 1000 [buy, buy] [2019-03-26, 2019-04-26]
1 1001 [buy, sell] [2019-05-26, 2019-06-26]
2 1002 [buy, sell] [2019-04-21, 2019-04-29]
3 1003 [sell, buy] [2019-03-21, 2019-04-23]
Use df.apply with df.pivot and df.fillna:
In [357]: x = df.set_index('items').apply(pd.Series.explode).reset_index()
In [360]: x.pivot('day', 'items', 'event').fillna('')
Out[360]:
items 1000 1001 1002 1003
day
2019-03-21 sell
2019-03-26 buy
2019-04-21 buy
2019-04-23 buy
2019-04-26 buy
2019-04-29 sell
2019-05-26 buy
2019-06-26 sell
CodePudding user response:
IIUC, try this:
import pandas as pd
df = pd.DataFrame({'items':[1000,1001,1002,1003],
'event':[['buy', 'buy'],
['buy', 'sell'],
['buy', 'sell'],
['sell', 'buy']],
'day':[['2019-03-26', '2019-04-26'],
['2019-05-26', '2019-06-26'],
['2019-04-21', '2019-04-39'],
['2019-03-21', '2019-04-23']]})
dfe = df.explode(['event', 'day'])
dfe.pivot('day', 'items', 'event').fillna('')
Output:
items 1000 1001 1002 1003
day
2019-03-21 sell
2019-03-26 buy
2019-04-21 buy
2019-04-23 buy
2019-04-26 buy
2019-04-39 sell
2019-05-26 buy
2019-06-26 sell

