I have a dataframe:
df_in = pd.DataFrame([["A","X",5,4,1],["B","Y",3,3,1],["C","Y",4,7,4]], columns=['col1', 'col2', 'col3', 'col4','col5'])
I want to repeat a row n times, with the value in col4 increasing by one for each repeated copy, starting from the value already present in col4.
Example: I want to repeat the B row 3 times, so that col4 counts up from its current value: 3, 4 and 5. Similarly, the C row should be repeated 2 times, with col4 counting up from its current value (7, 8).
Expected Output:
df_Out = pd.DataFrame([["A","X",5,4,1],["B","Y",3,3,1],["B","Y",3,4,1],["B","Y",3,5,1],["C","Y",4,7,4],["C","Y",4,8,4]], columns=['col1', 'col2', 'col3', 'col4','col5'])
How can I do this?
CodePudding user response:
Create a dictionary that maps each col1 value to its number of repeats, apply it with Series.map, and use 1 where there is no match. Then repeat the index values with Index.repeat and select the repeated rows with DataFrame.loc. Finally, add a per-row counter from GroupBy.cumcount to col4:
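d = {'B': 3, 'C': 2}
# rows not listed in d are kept once; repeat counts must be integers
df = df_in.loc[df_in.index.repeat(df_in['col1'].map(d).fillna(1).astype(int))]
# add the counter (0, 1, 2, ...) to the original col4 value instead of overwriting it
df['col4'] += df.groupby(level=0).cumcount()
df = df.reset_index(drop=True)
print(df)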
  col1 col2  col3  col4  col5
0    A    X     5     4     1
1    B    Y     3     3     1
2    B    Y     3     4     1
3    B    Y     3     5     1
4    C    Y     4     7     4
5    C    Y     4     8     4
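
If this is needed in more than one place, the same steps can be wrapped in a small helper. This is only a sketch: the function name repeat_rows_with_counter and its parameters are made up for illustration, and it assumes the input index is unique, since groupby(level=0) uses the original index labels to tell the copies of each row apart.

```python
import pandas as pd


def repeat_rows_with_counter(df, key_col, count_col, repeats):
    """Repeat rows by the counts in `repeats` and count up `count_col` per copy.

    Hypothetical helper; assumes `df` has a unique index.
    """
    # Keys missing from `repeats` become NaN and fall back to one copy.
    n = df[key_col].map(repeats).fillna(1).astype(int)
    out = df.loc[df.index.repeat(n)].copy()
    # cumcount() numbers the copies of each original row 0, 1, 2, ...
    out[count_col] = out[count_col] + out.groupby(level=0).cumcount()
    return out.reset_index(drop=True)


df_in = pd.DataFrame(
    [["A", "X", 5, 4, 1], ["B", "Y", 3, 3, 1], ["C", "Y", 4, 7, 4]],
    columns=["col1", "col2", "col3", "col4", "col5"],
)
expected = pd.DataFrame(
    [["A", "X", 5, 4, 1], ["B", "Y", 3, 3, 1], ["B", "Y", 3, 4, 1],
     ["B", "Y", 3, 5, 1], ["C", "Y", 4, 7, 4], ["C", "Y", 4, 8, 4]],
    columns=["col1", "col2", "col3", "col4", "col5"],
)

result = repeat_rows_with_counter(df_in, "col1", "col4", {"B": 3, "C": 2})
# Compare against the expected frame from the question.
print(result.equals(expected))
```

The .copy() after .loc keeps the assignment to count_col from touching, or warning about, the original frame.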
