Situation
I have a table of the most important features from a random forest model. The table looks like this:
I used a screenshot because it is the easiest way to give an overview; an MWE follows below.
My goal is to draw one bar plot per label (the column named "Label").
Problem
If we melt the table and have a look at the label p24D...
...we see that p24D in combination with m24U shows a value of 0. But if I create my plot it looks like:
We see in the p24D-facet a bar that is greater than 0 which is labeled with m24U.
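The check described above can be reproduced with a short sketch. It uses a made-up two-label, two-feature stand-in for df_imp (the names df_small and long_df and the numbers are assumptions for illustration, not the real data) and looks up the melted value for p24D / m24U:

```python
import pandas as pd

# Toy stand-in for df_imp: two labels, two features (values are made up).
df_small = pd.DataFrame({
    "Label": ["p24D", "p24U"],
    "m24D": [0.05, 0.0],
    "m24U": [0.0, 0.08],
})

# Wide -> long: one row per (Label, Feature) pair.
long_df = df_small.melt(id_vars="Label", var_name="Feature", value_name="Importance")

# Look up the importance of m24U for the label p24D.
mask = (long_df["Label"] == "p24D") & (long_df["Feature"] == "m24U")
value = long_df.loc[mask, "Importance"].iloc[0]
print(value)
```

If the plotted bar for that pair is non-zero while this lookup returns 0, the data is fine and the mismatch comes from how the bars are assigned to the tick labels.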
Conclusion
It seems that the bar heights and their distribution are correct for each group, but the x-labels are wrong because they are taken from the last plotted facet.
Question
I want to keep the same labeling for all facets, but the bars should be assigned to the correct labels. How can I do this?
MWE
The dataframe and the code for the plot
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df_imp = pd.DataFrame({'m12D': {0: 0.25987975843758654, 1: 0.18727707779383243, 2: 0.39100295701375354, 3: 0.06384800127268568,
4: 0.07999849502412754, 5: 0.13640019148970256, 6: 0.1412876367877005, 7: 0.09121120297702168,
8: 0.0, 9: 0.0},
'm01D': {0: 0.0724816118081828, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.06251755368842625, 6: 0.0,
7: 0.0, 8: 0.0, 9: 0.0},
'm06D': {0: 0.06145081213633308, 1: 0.15137716985355018, 2: 0.10217899239161463, 3: 0.0,
4: 0.0, 5: 0.0, 6: 0.06612051972885429, 7: 0.09406588670435026, 8: 0.0, 9: 0.0},
'm12U': {0: 0.047766259908712215, 1: 0.11029620061232079, 2: 0.058189273034798476, 3: 0.0,
4: 0.0, 5: 0.13754236428929292, 6: 0.24967144685607753, 7: 0.3109784881004455,
8: 0.07243867278541272, 9: 0.06597783007344389},
'm05D': {0: 0.04166653999225189, 1: 0.07952487761091377, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0,
6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'Label': {0: 'p03D', 1: 'p06D', 2: 'p12D', 3: 'p24D', 4: 'p48D', 5: 'p03U', 6: 'p06U',
7: 'p12U', 8: 'p24U', 9: 'p48U'},
'down': {0: 0.0, 1: 0.05780607422803258, 2: 0.0, 3: 0.06127000511754594, 4: 0.05020367686447687,
5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'm06U': {0: 0.0, 1: 0.0, 2: 0.08466651640727402, 3: 0.0, 4: 0.0, 5: 0.05309415831396426,
6: 0.10234265392288792, 7: 0.10916311424468256, 8: 0.039101554776822, 9: 0.04173916824046613},
'p06T': {0: 0.0, 1: 0.0, 2: 0.056553945901284826, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0,
8: 0.0, 9: 0.0},
'm24D': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.05209052270599571, 4: 0.06155322163002999, 5: 0.0,
6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'p24T': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.05411155410020583, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0,
8: 0.0, 9: 0.0},
'wday_6': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.059271604760677284, 4: 0.0, 5: 0.0,
6: 0.0, 7: 0.0, 8: 0.035176474643147146, 9: 0.0},
'p48T': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.04623563899446924, 5: 0.0, 6: 0.0, 7: 0.0,
8: 0.0, 9: 0.0},
'wday_4': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.04342742013282083, 5: 0.0, 6: 0.0,
7: 0.0, 8: 0.0, 9: 0.0},
'm02D': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.049231996352442,
6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'up_sum': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.05893778686091927,
7: 0.0, 8: 0.0667566604504024, 9: 0.09670821121474267},
'm05U': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0590711833092717,
8: 0.0, 9: 0.0},
'm24U': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0,
7: 0.0, 8: 0.0844568188980733, 9: 0.11262510249213625},
'wday_5': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0,
9: 0.03522099307752065}})
df_imp2 = df_imp.fillna(0)
df_imp2 = df_imp2.melt(id_vars="Label")
df_imp2 = df_imp2.rename(columns={"value":"Importance", "variable":"Feature"})
sns.set_theme(style="white")
g = sns.FacetGrid(df_imp2, col="Label", height=1.5, aspect=5, col_wrap=2, margin_titles=True, despine=False)
g.map(sns.barplot, "Feature", "Importance", order=['m12D', 'm01D', 'm06D', 'm12U', 'm05D', 'down', 'm06U', 'p06T',
                                                   'm24D', 'p24T', 'wday_6', 'p48T', 'wday_4', 'm02D', 'up_sum', 'm05U',
                                                   'm24U', 'wday_5'])
g.figure.subplots_adjust(wspace=0.02, hspace=0.4)
g.set(yticks=np.arange(0,0.4,0.1))
g.set_xticklabels(rotation=30)
plt.show()
Answer
You could sort the data before plotting, so that every facet sees the features in the same order:
df_imp2 = df_imp.fillna(0)
df_imp2 = df_imp2.melt(id_vars="Label")
df_imp2 = df_imp2.rename(columns={"value":"Importance", "variable":"Feature"})
df_imp2 = df_imp2.sort_values(['Label','Feature'])  # sort so each facet lists the features in the same (alphabetical) order
sns.set_theme(style="white")
g = sns.FacetGrid(df_imp2, col="Label", height=1.5, aspect=5, col_wrap=2, margin_titles=True, despine=False)
g.map(sns.barplot, "Feature", "Importance")
g.figure.subplots_adjust(wspace=0.02, hspace=0.4)
g.set(yticks=np.arange(0,0.4,0.1))
g.set_xticklabels(rotation=30)
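As an alternative to sorting, a minimal sketch using the figure-level sns.catplot with an explicit order list may be worth trying. The seaborn documentation recommends figure-level functions over using FacetGrid directly for categorical plots, because the category order then stays synchronized across facets. The toy frame df_small, the name feature_order, and the values are assumptions for illustration; the Agg backend is only there so the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import pandas as pd
import seaborn as sns

# Toy stand-in for df_imp: two labels, two features (values are made up).
df_small = pd.DataFrame({
    "Label": ["p24D", "p24U"],
    "m24D": [0.05, 0.0],
    "m24U": [0.0, 0.08],
})
long_df = df_small.melt(id_vars="Label", var_name="Feature", value_name="Importance")

# One explicit category order, shared by every facet.
feature_order = ["m24D", "m24U"]

# catplot builds the FacetGrid itself and applies the same order to each facet.
g = sns.catplot(
    data=long_df,
    x="Feature",
    y="Importance",
    col="Label",
    kind="bar",
    order=feature_order,
    col_wrap=2,
    height=1.5,
    aspect=5,
)
```

With the real data, feature_order would be the full list of feature columns, and the layout tweaks from the question (subplots_adjust, yticks, rotated tick labels) can be applied to g in the same way as before.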


