insert rows to pandas Dataframe based on condition?-CodePudding

I`m using pandas dataframe to read .csv format file. I would like to insert rows when specific column values changed from value to other. My data is shown as follow:

Id   type
 1    car
 1  track
 2  train
 2  plane
 3    car

I need to add row that contains Id is empty and type value is number 4 after any change in Id column value. My desired output should like this:

Id   type
 1    car
 1  track
        4
 2  train
 2  plane
        4
 3    car

How I do this??

CodePudding user response：

You can do this:

df = pd.read_csv('input.csv', sep=";")
 Id   type
0   1    car
1   1  track
2   2  train
3   2  plane
4   3    car

mask = df['Id'].ne(df['Id'].shift(-1))
df1 = pd.DataFrame('4',index=mask.index[mask]   .5, columns=df.columns)
df1['Id'] = df['Id'].replace({'4':' '})
df = pd.concat([df, df1]).sort_index().reset_index(drop=True).iloc[:-1]

which gives:

 Id   type
0  1.0    car
1  1.0  track
2  NaN      4
3  2.0  train
4  2.0  plane
5  NaN      4
6  3.0    car

CodePudding user response：

You can do:

In [244]: grp = df.groupby('Id')
In [256]: res = pd.DataFrame()

In [257]: for x,y in grp:
     ...:     if y['type'].count() > 1:
     ...:         tmp = y.append(pd.DataFrame({'Id': [''], 'type':[4]}))
     ...:         res = res.append(tmp)
     ...:     else:
     ...:         res = res.append(y)
     ...: 

In [258]: res
Out[258]: 
  Id   type
0  1    car
1  1  track
0         4
2  2  train
3  2  plane
0         4
4  3    car

CodePudding user response：

You could use groupby to split by groups and append the rows in a list comprehension before merging again with contact:

df2 = pd.concat([d.append(pd.Series([None, 4], index=['Id', 'type']), ignore_index=True)
                 for _,d in df.groupby('Id')], ignore_index=True).iloc[:-1]

If the index is sorted, another option is to find the index of the last item per group and use it to generate the new rows:

# get index of last item per group (except last)
idx = df.index.to_series().groupby(df['Id']).last().values[:-1]

# craft a DataFrame with the new rows
d = pd.DataFrame([[None, 4]]*len(idx), columns=df.columns, index=idx)

# concatenate and reorder
pd.concat([df, d]).sort_index().reset_index(drop=True)

output:

    Id   type
0  1.0    car
1  1.0  track
2  NaN    4.0
3  2.0  train
4  2.0  plane
5  NaN    4.0
6  3.0    car

CodePudding user response：

Please find the solution below using index :

   ######  Create a shift variable to compare index
         df['idshift'] = df['Id'].shift(1)
        # When shift id does not match id, mean change index
        change_index = df.index[df['idshift']!=df['Id']].tolist()
        change_index
        # Loop through all the change index and insert at index
        for i in change_index[1:]:
           line = pd.DataFrame({"Id": ' ' , "rate": 4}, index=[(i-1) .5])
           df = df.append(line, ignore_index=False)
        # finallt sort the index 
           df = df.sort_index().reset_index(drop=True)

Input Dataframe :

df = pd.DataFrame({'Id': [1,1,2,2,3,3,3,4],'rate':[1,2,3,10,12,16,10,12]})

Ouput Results from the code :