Turn a 3-dimensional groupby DataFrame into a bar chart differentiated by color based on one of the co

Time:02-04

My original data looks something like this:

Day  Time      Type
3    21:00       0
3    21:00       0
3    22:00       0
4    21:00       0
3    21:00       1
3    22:00       1
4    22:00       1
3    21:00       2
4    22:00       2
4    22:00       2

This is the resulting grouped data:

Type    Day    Hour
0       3    21    2
             22    1
        4    21    1
1       3    21    1
             22    1
        4    22    1
2       3    21    1
        4    22    2

Think of this as knocks on my front door, back door, and side door, where 0 is the front door, 1 the back door, and 2 the side door. The grouped data shows the number of knocks on each door per day and hour; the rightmost column is the count.

I now want to show this data in a bar chart where the bars for the same day and hour are stacked on top of each other, each in a different color depending on its type.

Example of how it should look in the end

This is roughly what I am looking for. I have been playing around with matplotlib but just can't get it to work. I hope someone can help.

Edit: Here is my groupby code:

time_data_station1 = df.groupby([df["Type"], df["CREATION TIME"].dt.day, df["CREATION TIME"].dt.hour]).size()

CodePudding user response:

IIUC,

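import pandas as pd

# Sample data shaped like the grouped result: index levels are (door type, day, hour)
df = pd.DataFrame({'knocks':[2,3,5,1,2,2,3,5,3]},
                 index=pd.MultiIndex.from_arrays([[0,0,0,1,1,1,2,2,2],
                                                  [3.,3.,4.,3.,4.,4.,3.,3.,4.],
                                                  [21,22,21,21,21,22,21,22,21]]))

# Move the door type (index level 0) into the columns: one column per door.
# Day/hour combinations that a door never appears in become NaN.
dfu = df.unstack(0)['knocks']
dfu.index = [f'Day {i} - Hour {j}' for i, j in dfu.index]

# Each row is one bar; each column is one colored segment of that bar
ax = dfu.plot.bar(stacked=True, figsize=(12,8), rot=0)
ax.legend(title='Door')
ax.set_ylabel('Number of Knocks')
ax.set_title('Daily Hourly Knocks by Door')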

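For the rows actually posted in the question, the same unstack idea could be sketched as follows. This is only an illustration: the `raw` frame and its plain `Day`/`Hour` columns are assumptions standing in for the values the asker derives from "CREATION TIME", and the door names come from the question's description of types 0, 1, and 2.

```python
import pandas as pd

# Rows copied from the question. "Day" and "Hour" are assumed to be already
# extracted from the asker's datetime column (hypothetical column names).
raw = pd.DataFrame({
    "Day":  [3, 3, 3, 4, 3, 3, 4, 3, 4, 4],
    "Hour": [21, 21, 22, 21, 21, 22, 22, 21, 22, 22],
    "Type": [0, 0, 0, 0, 1, 1, 1, 2, 2, 2],
})

# Count rows per (Day, Hour, Type), then move Type into the columns.
# fill_value=0 fills day/hour slots where a door has no knocks.
counts = raw.groupby(["Day", "Hour", "Type"]).size().unstack("Type", fill_value=0)

# Door names as described in the question: 0 = front, 1 = back, 2 = side
counts = counts.rename(columns={0: "front", 1: "back", 2: "side"})
counts.index = [f"Day {d} - Hour {h}" for d, h in counts.index]

# One bar per day/hour, one colored segment per door
ax = counts.plot.bar(stacked=True, rot=0)
ax.legend(title="Door")
ax.set_ylabel("Number of Knocks")
```

Outside a notebook you would still need `matplotlib.pyplot.show()` (or `ax.figure.savefig(...)`) to see the chart.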

CodePudding user response:

Okay this is what I did:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

df = pd.DataFrame({"Day":[3,3,3,4,4,4], "Hour":[21,21,22,21,21,22], "Type":[0,1,0,0,1,0], "Amount":[2,1,3,5,2,2]})

first = df[df["Type"] == 0].sort_values(["Day", "Hour"])
first.reset_index(inplace=True, drop=True)
second = df[df["Type"] == 1].sort_values(["Day", "Hour"])
second.reset_index(inplace=True, drop=True)
#third = df[df["Type"] == 2].sort_values(by=["Day", "Hour"])
#third.reset_index(inplace=True, drop=True)
#fourth = df[df["Type"] == 3].sort_values(by=["Day", "Hour"])
#fourth.reset_index(inplace=True, drop=True)
total = df.groupby(["Day","Hour"]).sum()
total.reset_index(inplace=True)

for i in [first,second]: # [first,second,third,fourth] modify this to increase the number of stacked elements
    ax = sns.barplot(x=total.index, y=total["Amount"], color="#09b0a8", alpha=0.3) #alpha makes color stacking easier
    for j in i.values:
        for num, val in enumerate(total.values):
            if val[0] == j[0] and val[1] == j[1]:
                total.loc[num, "Amount"] -= j[3]  # subtract this type's amount so the next pass draws the remainder (avoids chained assignment)

plt.show()

This is a very manual way of solving the problem. There's definitely a library for this.
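A less manual route, sketched here under the assumption that the data is in the same long format as `df` above, is to pivot `Type` into columns with pandas and let its built-in stacked bar plot do the stacking. The name `stacked` is just illustrative.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    "Day": [3, 3, 3, 4, 4, 4],
    "Hour": [21, 21, 22, 21, 21, 22],
    "Type": [0, 1, 0, 0, 1, 0],
    "Amount": [2, 1, 3, 5, 2, 2],
})

# One row per (Day, Hour), one column per Type; missing combinations become 0
stacked = df.pivot_table(index=["Day", "Hour"], columns="Type",
                         values="Amount", aggfunc="sum", fill_value=0)

# pandas draws one bar per row and one colored segment per column
ax = stacked.plot.bar(stacked=True, rot=0)
ax.set_ylabel("Amount")
```

Adding more types then needs no extra code, since every distinct `Type` value simply becomes another column.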
