When replicating a dataframe using concat while keeping the index (see example here), is there a way to assign a count variable for each iteration in a new column c, so that column c records which copy each row belongs to?
Orig df:
| | a | b |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 2 | 3 |
df replicated with pd.concat([df] * 5) and with an additional column c:
| | a | b | c |
|---|---|---|---|
| 0 | 1 | 2 | 1 |
| 1 | 2 | 3 | 1 |
| 0 | 1 | 2 | 2 |
| 1 | 2 | 3 | 2 |
| 0 | 1 | 2 | 3 |
| 1 | 2 | 3 | 3 |
| 0 | 1 | 2 | 4 |
| 1 | 2 | 3 | 4 |
| 0 | 1 | 2 | 5 |
| 1 | 2 | 3 | 5 |
This is a multi-row dataframe where the count variable would have to be applied to multiple rows.
Thanks for your thoughts!
CodePudding user response:
You could use np.arange and np.repeat:
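import numpy as np
import pandas as pd

N = 5
new_df = pd.concat([df] * N)
# np.arange(N) gives 0..N-1; repeat each value once per row of df, then shift to 1..N
new_df['c'] = np.repeat(np.arange(N), df.shape[0]) + 1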
Output:
>>> new_df
   a  b  c
0  1  2  1
1  2  3  1
0  1  2  2
1  2  3  2
0  1  2  3
1  2  3  3
0  1  2  4
1  2  3  4
0  1  2  5
1  2  3  5
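For anyone who wants to run this end to end, here is a minimal self-contained sketch. The construction of df is an assumption reconstructed from the table in the question, and alt_df is a hypothetical alternative (not part of the answer above) that tags each copy with DataFrame.assign before concatenating, which should give the same column c:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's dataframe (index 0..1, columns a and b)
df = pd.DataFrame({"a": [1, 2], "b": [2, 3]})
N = 5

# Approach from the answer: concatenate N copies, then number the blocks 1..N
new_df = pd.concat([df] * N)
new_df["c"] = np.repeat(np.arange(N), len(df)) + 1

# Alternative sketch: add the counter to each copy before concatenating
alt_df = pd.concat([df.assign(c=i) for i in range(1, N + 1)])

print(new_df)
```

Both versions keep the original index (0, 1, 0, 1, ...). If a fresh 0..9 index is preferred, passing ignore_index=True to pd.concat should do that.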
