I'm trying to sample 1000 unique users from a dataset. These can be any 1000 users, but I want to extract all rows for each of them.
Input
| User_ID | Ship Date |
|---|---|
| A454 | 8/2/2019 |
| A454 | 9/2/2019 |
| G658 | 9/2/2019 |
| G658 | 9/2/2019 |
from random import sample
df['User_ID'].sample(n=1000, random_state=1)
I tried the code above, but it only returns 1000 values from the User_ID column, not all rows for 1000 unique users.
Answer:
IIUC, get the unique values, sample and slice with isin and boolean indexing:
import random
# Draw 1000 distinct IDs, then keep every row whose User_ID is among them
out = df[df['User_ID'].isin(random.sample(list(df['User_ID'].unique()), 1000))]
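
If a reproducible draw matters, the same idea can probably be done with pandas alone. The following is only a sketch on toy data: the `K901` user and the `n_users` and `sampled_ids` names are made up for illustration, and `n_users` would be 1000 on the real data.

```python
import pandas as pd

# Toy data modeled on the question's table; the K901 row is invented for illustration.
df = pd.DataFrame({
    "User_ID": ["A454", "A454", "G658", "G658", "K901"],
    "Ship Date": ["8/2/2019", "9/2/2019", "9/2/2019", "9/2/2019", "10/2/2019"],
})

n_users = 2  # assumption for the toy data; use 1000 on the real DataFrame

# Sample distinct IDs (not rows), seeded so the draw is repeatable.
sampled_ids = df["User_ID"].drop_duplicates().sample(n=n_users, random_state=1)

# Keep every row that belongs to one of the sampled users.
out = df[df["User_ID"].isin(sampled_ids)]
print(out)
```

Note that `sample` raises a `ValueError` when `n` is larger than the number of unique users, so it may be worth checking `df["User_ID"].nunique()` first.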
