Explode dataset based on a specific column in Python

Time:01-21

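I want to explode my dataset based on a specific column in Python: each row should be repeated as many times as its stat value, and rows where stat is 0 should be kept once.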

Data

id  type    date    stat    energy
aa  ss      Q1 2022 3       10
aa  ss      Q2 2022 2       10
bb  uu      Q1 2022 1       15
bb  uu      Q2 2022 3       15
cc  ii      Q1 2022 0       0
            

Desired

id  type    date    stat    energy
aa  ss     Q1 2022  3       10
aa  ss     Q1 2022  3       10
aa  ss     Q1 2022  3       10
aa  ss     Q2 2022  2       10
aa  ss     Q2 2022  2       10
bb  uu     Q1 2022  1       15
bb  uu     Q2 2022  3       15
bb  uu     Q2 2022  3       15
bb  uu     Q2 2022  3       15
cc  ii     Q1 2022  0       0

Doing

df.explode(list['stat'])

Any suggestions are appreciated.

CodePudding user response:

Use df.index.repeat:

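import numpy as np

# Number of copies per row: the stat value, but at least 1 so stat == 0 rows are kept
repeats = np.where(df['stat'] == 0, 1, df['stat'])
# OR
repeats = df['stat'].clip(lower=1)

# Repeat each index label, select those rows, then renumber the index
out = df.reindex(df.index.repeat(repeats)).reset_index(drop=True)
print(out)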

# Output
   id type     date  stat  energy
0  aa   ss  Q1 2022     3      10
1  aa   ss  Q1 2022     3      10
2  aa   ss  Q1 2022     3      10
3  aa   ss  Q2 2022     2      10
4  aa   ss  Q2 2022     2      10
5  bb   uu  Q1 2022     1      15
6  bb   uu  Q2 2022     3      15
7  bb   uu  Q2 2022     3      15
8  bb   uu  Q2 2022     3      15
9  cc   ii  Q1 2022     0       0
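
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and wraps the same idea in a function. The name repeat_rows is only illustrative (it is not a pandas function), and the sketch assumes the index labels are unique, as they are with the default RangeIndex, because reindex does not accept duplicate labels.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["aa", "aa", "bb", "bb", "cc"],
    "type": ["ss", "ss", "uu", "uu", "ii"],
    "date": ["Q1 2022", "Q2 2022", "Q1 2022", "Q2 2022", "Q1 2022"],
    "stat": [3, 2, 1, 3, 0],
    "energy": [10, 10, 15, 15, 0],
})


def repeat_rows(frame, count_col):
    """Illustrative helper: repeat each row count_col times, keeping zero-count rows once.

    Assumes frame has unique index labels.
    """
    # At least one copy per row, so rows with a count of 0 are not dropped
    repeats = frame[count_col].clip(lower=1).to_numpy()
    # Repeat the index labels, select those rows, then renumber from 0
    return frame.reindex(frame.index.repeat(repeats)).reset_index(drop=True)


out = repeat_rows(df, "stat")
print(out)
```

The original df is left unchanged, so the function can be called again with a different column if needed.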

CodePudding user response:

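Another solution is to turn each stat value into a list and explode it:

# Each stat value becomes a list holding that value stat times (once if stat is 0)
df['stat'] = [[x]*x if x > 0 else [x] for x in df['stat']]
# explode gives every list element its own row; the index labels are repeated
new = df.explode('stat')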
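
A variation on the explode approach, as a sketch rather than a definitive version: it leaves the original df untouched, renumbers the index, and casts stat back to its original integer dtype, since explode leaves an object column behind. The sample data is copied from the question; the variable names lists and new are just for illustration, and ignore_index assumes pandas 1.1 or later.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["aa", "aa", "bb", "bb", "cc"],
    "type": ["ss", "ss", "uu", "uu", "ii"],
    "date": ["Q1 2022", "Q2 2022", "Q1 2022", "Q2 2022", "Q1 2022"],
    "stat": [3, 2, 1, 3, 0],
    "energy": [10, 10, 15, 15, 0],
})

# One list per row: the stat value repeated stat times, or once when stat is 0
lists = pd.Series(
    [[x] * max(x, 1) for x in df["stat"].tolist()],
    index=df.index,
)

# assign returns a new frame, so df itself keeps its integer stat column
new = df.assign(stat=lists).explode("stat", ignore_index=True)

# explode produces an object column; restore the original dtype
new["stat"] = new["stat"].astype(df["stat"].dtype)
print(new)
```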