I would like to understand if I can shape a DataFrame to a multi-index and multi-header/multi-column (pivot) DataFrame without aggregation since this aggregation calculation is already present on the columns of my DataFrame.
I have the following DataFrame:
card_type payment_status airbnb paid revenue - sum revenue - min debit - sum
American Express Checked Out Premium Queen Ensuite No 591.49 0.0 2
American Express Checked Out Queen Room w. Shared Facilities No 255.52 0.0 2
American Express Checked Out Single Room w. Shared Facilities No 1602.02 0.0 5
American Express Confirmed Compact Double Room w. Shared Facilities No 189.05 0.0 1
American Express Confirmed Premium Queen Ensuite No 350.0 0.0 1
American Express Confirmed Queen Room w. Shared Facilities Yes 110.53 0.0 1
American Express Confirmed Single Room w. Shared Facilities No 4258.48 0.0 3
Mastercard Cancelled Queen Room w. Shared Facilities No 28.5 0.0 3
Mastercard Cancelled Single Room w. Shared Facilities Yes 578.55 0.0 2
Mastercard Checked Out Compact Double Room w. Shared Facilities No 4637.71 0.0 22
...
df = pd.DataFrame.from_dict({
'card_type': {0: 'American Express', 1: 'American Express', 2: 'American Express', 3: 'American Express', 4: 'American Express', 5: 'American Express', 6: 'American Express', 7: 'Mastercard', 8: 'Mastercard', 9: 'Mastercard'},
'payment_status': {0: 'Checked Out', 1: 'Checked Out', 2: 'Checked Out', 3: 'Confirmed', 4: 'Confirmed', 5: 'Confirmed', 6: 'Confirmed', 7: 'Cancelled', 8: 'Cancelled', 9: 'Checked Out'},
'airbnb': {0: 'Premium Queen Ensuite ', 1: 'Queen Room w. Shared Facilities ', 2: 'Single Room w. Shared Facilities ', 3: 'Compact Double Room w. Shared Facilities ', 4: 'Premium Queen Ensuite ', 5: 'Queen Room w. Shared Facilities ', 6: 'Single Room w. Shared Facilities ', 7: 'Queen Room w. Shared Facilities ', 8: 'Single Room w. Shared Facilities ', 9: 'Compact Double Room w. Shared Facilities '},
'paid': {0: 'No', 1: 'No', 2: 'No', 3: 'No', 4: 'No', 5: 'Yes', 6: 'No', 7: 'No', 8: 'Yes', 9: 'No'},
'revenue - sum': {0: 591.49, 1: 255.52, 2: 1602.02, 3: 189.05, 4: 350.0, 5: 110.53, 6: 4258.48,7: 28.5, 8: 578.55, 9: 4637.71},
'revenue - min': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'debit - sum': {0: 2, 1: 2, 2: 5, 3: 1, 4: 1, 5: 1, 6: 3, 7: 3, 8: 2, 9: 22}})
I have used this method (based on 
Any way I can get around with this? Thanks!
CodePudding user response:
If you have already computed your values, you can use either:
pivot_tablewithaggfunc='first'andfill_value='_'pivotandfillna('-')
For your column levels, use reorder_levels instead of swaplevel to rearrange colimns levels using input order: levels [0, 1, 2] to [1, 2, 0]:
out = df.pivot(index=["card_type", "payment_status"],
columns=["airbnb", "paid"],
values=["revenue - sum", "revenue - min", "debit - sum"]) \
.fillna('-').reorder_levels([1, 2, 0], axis=1)
Output:
>>> out
airbnb Premium Queen Ensuite Queen Room w. Shared Facilities Single Room w. Shared Facilities ... Compact Double Room w. Shared Facilities Queen Room w. Shared Facilities Single Room w. Shared Facilities
paid No No No ... No Yes Yes
revenue - sum revenue - sum revenue - sum ... debit - sum debit - sum debit - sum
card_type payment_status ...
American Express Checked Out 591.49 255.52 1602.02 ... - - -
Confirmed 350.0 - 4258.48 ... 1.0 1.0 -
Mastercard Cancelled - 28.5 - ... - - 2.0
Checked Out - - - ... 22.0 - -
Update
I would like to create one more level which results from the split of values by: "-"
As you have to break some columns names into two parts, use a different strategy. First, move some columns as index of your dataframe then explode your remain columns names into multi level. Finally, unstack your airbnb and paid index levels then rearrange the order of your column levels:
out = df.set_index(['card_type', 'payment_status', 'airbnb', 'paid'])
out.columns = out.columns.str.split(' - ').map(tuple)
out = out.unstack(['airbnb', 'paid'], fill_value='-') \
.reorder_levels([2, 3, 0, 1], axis=1)
Output:
>>> out
airbnb Compact Double Room w. Shared Facilities Premium Queen Ensuite ... Queen Room w. Shared Facilities Single Room w. Shared Facilities
paid No Yes No ... Yes No Yes
revenue revenue revenue ... debit debit debit
sum sum sum ... sum sum sum
card_type payment_status ...
American Express Checked Out - - 591.49 ... - 5 -
Confirmed 189.05 - 350.0 ... 1 3 -
Mastercard Cancelled - - - ... - - 2
Checked Out 4637.71 - - ... - - -
[4 rows x 24 columns]
