I have a dataset that I grouped by two columns. Now I want to plot a graph for the top N groups based on one column. To explain it better, here is an example dataset, created from the main dataset using groupby:
| Data1 | Data2 | Value |
|---|---|---|
| A | x | 6 |
| A | y | 7 |
| A | z | 8 |
| B | y | 3 |
| B | z | 4 |
| B | u | 5 |
| C | x | 6 |
| C | y | 7 |
| C | v | 8 |
| D | v | 4 |
| D | y | 5 |
| D | z | 7 |
| E | t | 8 |
| E | u | 7 |
| E | x | 6 |
| F | s | 4 |
| F | s | 5 |
| F | r | 6 |
Now I want to keep only the top 3 Data1 groups in a new dataset and plot it with seaborn. Below is the desired result:
| Data1 | Data2 | Value |
|---|---|---|
| A | x | 6 |
| A | y | 7 |
| A | z | 8 |
| B | y | 3 |
| B | z | 4 |
| B | u | 5 |
| C | x | 6 |
| C | y | 7 |
| C | v | 8 |
Answer:
IIUC, you want to keep the first N groups of Data1?
You can take the unique values of Data1, slice off the first N (unique keeps them in order of appearance), and then filter the rows with isin and boolean indexing:
N = 3
out = df[df['Data1'].isin(df['Data1'].unique()[:N])]
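For anyone who wants to run this end to end, here is a minimal sketch. It assumes the grouped result is already a flat DataFrame named `df` with the columns shown in the question; the sample data is copied from the question's table, and the name `first_groups` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question's table (assumed to be a flat DataFrame)
df = pd.DataFrame({
    "Data1": list("AAABBBCCCDDDEEEFFF"),
    "Data2": list("xyzyzuxyvvyztuxssr"),
    "Value": [6, 7, 8, 3, 4, 5, 6, 7, 8, 4, 5, 7, 8, 7, 6, 4, 5, 6],
})

N = 3
# unique() returns the Data1 values in order of first appearance
first_groups = df["Data1"].unique()[:N]
# keep every row whose Data1 belongs to one of those N groups
out = df[df["Data1"].isin(first_groups)]
print(out)
```

The filtered frame can then be handed to seaborn, for example `sns.barplot(data=out, x="Data1", y="Value", hue="Data2")`, if a grouped bar chart is what you are after.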
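Another option is itertools.islice and pandas.concat on the groupby (less efficient). Note that groupby sorts the keys by default, so this returns the first N groups in sorted order; pass sort=False to keep the order of appearance:
from itertools import islice
import pandas as pd
# take the first N groups from the groupby iterator and stitch them back together
out = pd.concat([g for _, g in islice(df.groupby('Data1'), N)])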
Output:
  Data1 Data2  Value
0     A     x      6
1     A     y      7
2     A     z      8
3     B     y      3
4     B     z      4
5     B     u      5
6     C     x      6
7     C     y      7
8     C     v      8
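If "top N" is instead meant as the N groups with the largest total Value rather than the first N in order, a variation along these lines might work. This is only a sketch under that assumption: ranking by the sum of Value is my reading rather than something stated in the question, and the names `totals`, `top_groups`, and `out_ranked` are illustrative.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    "Data1": list("AAABBBCCCDDDEEEFFF"),
    "Data2": list("xyzyzuxyvvyztuxssr"),
    "Value": [6, 7, 8, 3, 4, 5, 6, 7, 8, 4, 5, 7, 8, 7, 6, 4, 5, 6],
})

N = 3
# total Value per Data1 group (assumed ranking criterion)
totals = df.groupby("Data1")["Value"].sum()
# labels of the N groups with the largest totals
top_groups = totals.nlargest(N).index
# keep only the rows that belong to those groups
out_ranked = df[df["Data1"].isin(top_groups)]
print(out_ranked)
```

With the sample data, this selects different groups than the first-N approach, so it is worth deciding which meaning of "top" you need before plotting.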
