I managed to group a dataset by multiple columns and apply an aggregate method on a different column to get the mean of the yearly Sales for each Country:
mean_sales = df.groupby(by=['Country', 'Year'])['Sales'].mean()
This is what the result looks like after the groupby:
| Country | Year | Sales |
|---|---|---|
| UK | 2009 | 2 |
| | 2010 | 3 |
| | 2011 | 5 |
| Spain | 2009 | 5 |
| | 2010 | 6 |
| | 2011 | 7 |
| Germany | 2009 | 2 |
| | 2010 | 4 |
| | 2011 | 8 |
| Italy | 2009 | 6 |
| | 2010 | 8 |
| | 2011 | 9 |
I would like to obtain individual line plots, one for each country, with the values of the mean (on the y-axis) for the different years (on the x-axis).
I have tried several options found in previous discussions, but none of them does what I want to achieve.
CodePudding user response:
There's probably an easy way to do this with pandas' native plotting functionality, but it's also easy to do with Seaborn.
Note: the result of the groupby operation as currently written is a Series, so pass as_index=False to the groupby function and you'll get a DataFrame back.
import seaborn as sns
sns.lineplot(data=df, x="Year", y="Sales", hue="Country")
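For the as_index=False note and the pandas-native route mentioned above, here is a minimal sketch. The sample data is made up for illustration (the real df isn't shown in the question), and mean_series, mean_df and wide are just names I picked. The Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({
    "Country": ["UK", "UK", "UK", "UK", "Spain", "Spain", "Spain", "Spain"],
    "Year": [2009, 2009, 2010, 2010, 2009, 2009, 2010, 2010],
    "Sales": [1, 3, 2, 4, 4, 6, 5, 7],
})

# Default groupby: a Series indexed by a (Country, Year) MultiIndex
mean_series = df.groupby(["Country", "Year"])["Sales"].mean()

# as_index=False: a flat DataFrame with Country, Year and Sales columns
mean_df = df.groupby(["Country", "Year"], as_index=False)["Sales"].mean()

# pandas-native plotting: reshape to one column per country, years on the index
wide = mean_df.pivot(index="Year", columns="Country", values="Sales")

# subplots=True draws one Axes per column, i.e. one line plot per country
axes = wide.plot(subplots=True, sharex=True, marker="o")
plt.tight_layout()
```

Dropping subplots=True should put all countries as separate lines on a single plot instead.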
Also, one point of clarification: if you actually want them on separate plots, and not just separate lines within a single plot, you can do something like this:
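import matplotlib.pyplot as plt

countries = df["Country"].unique()
# One row of axes per country, all sharing the x-axis (years)
fig, ax = plt.subplots(nrows=len(countries), sharex=True, figsize=(8, 12))
for idx, country in enumerate(countries):
    country_df = df.loc[df["Country"] == country]
    sns.lineplot(data=country_df, x="Year", y="Sales", ax=ax[idx])
    ax[idx].set_title(country)
fig.tight_layout()  # call after plotting so titles and labels are taken into account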
CodePudding user response:
Another way to plot them in the same figure is to use sns.catplot. If you want them on separate graphs that are still comparable to each other, use sns.FacetGrid, where you could make the row equivalent to Country and the column to Year. Both expect a DataFrame, so call reset_index() on the grouped Series first.
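# mean_sales is a Series; reset_index() gives a DataFrame with Country, Year and Sales columns
mean_df = mean_sales.reset_index()
g = sns.FacetGrid(mean_df, col="Year", row="Country")
g.map_dataframe(sns.barplot, x="Year", y="Sales")
or
g = sns.catplot(data=mean_df, x="Year", y="Sales", col="Country", col_wrap=3, kind="bar", height=2.5, aspect=.8)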
