Home > database >  How to substract the dataframe rows from another dataframe rows?
How to substract the dataframe rows from another dataframe rows?

Time:01-04

I have two data frames given below. I want to subtract the rows of df1 by rows of df2 based on specific date values. Here, in the example, I have given the output based on date 2020-02-20. The rows after a specific date (i.e. 2020-02-20) in df1 are subtracted by rows after the specific date (i.e. 2020-02-20) from df2.

df1:

|  id  ||    date     | | 00:00:00 | | 00:15:00 | | 00:30:00 |.......| 00:23:45 | 
| ---- ||  --------   | | -------- | | -------- | | -------- |.......| -------- |
| ID1  || 2020-01-01  | |   0.15   | |   0.19   | |   0.18   |.......|   0.12   |
| ID1  || 2020-01-02  | |   0.12   | |   0.12   | |   0.13   |.......|   0.09   |
| ID1  || 2020-01-03  | |   0.11   | |   0.10   | |   0.11   |.......|   0.10   |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
| ID1  || 2020-02-20  | |   0.12   | |   0.10   | |   0.15   |.......|   0.13   |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
| ID1  || 2020-12-31  | |   0.14   | |   0.16   | |   0.18   |.......|   0.15   |

df2:

|    date     | | 00:00:00 | | 00:15:00 | | 00:30:00 |.......| 00:23:45 | 
|  --------   | | -------- | | -------- | | -------- |.......| -------- |
| 2020-01-01  | |     0    | |     0    | |   0.08   |.......|   0.06   |
| 2020-01-02  | |     0    | |     0    | |   0.09   |.......|   0.02   |
| 2020-01-03  | |     0    | |     0    | |   0.05   |.......|   0.08   |
|      .      | |     .    | |     .    | |    .     |.......|    .     |
|      .      | |     .    | |     .    | |    .     |.......|    .     |
| 2020-02-20  | |     0    | |     0    | |   0.10   |.......|   0.06   |
|      .      | |     .    | |     .    | |    .     |.......|    .     |
|      .      | |     .    | |     .    | |    .     |.......|    .     |
| 2020-12-31  | |     0    | |   0.04   | |   0.08   |.......|   0.07   |

Output dataframe (df_out) should be:

|  id  ||    date     | | 00:00:00 | | 00:15:00 | | 00:30:00 |.......| 00:23:45 | 
| ---- ||  --------   | | -------- | | -------- | | -------- |.......| -------- |
| ID1  || 2020-01-01  | |   0.15   | |   0.19   | |   0.18   |.......|   0.12   |
| ID1  || 2020-01-02  | |   0.12   | |   0.12   | |   0.13   |.......|   0.09   |
| ID1  || 2020-01-03  | |   0.11   | |   0.10   | |   0.11   |.......|   0.10   |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
| ID1  || 2020-02-20  | |   0.12   | |   0.10   | |   0.05   |.......|   0.07   |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
|  .   ||      .      | |    .     | |    .     | |    .     |.......|    .     |
| ID1  || 2020-12-31  | |   0.14   | |   0.12   | |   0.10   |.......|   0.08   |

CodePudding user response:

If the indexes of dataframes are equal, you can simply subtract them:

output = df - df2

to set an index, use set_index:

df.set_index('date', inplace=True)

to only subtract a part of one dataframe from another, slice the dataframes:

df.loc[df.index > '2020-02-20']

Total solution would be something like this:

df.set_index('date', inplace=True)
df2.set_index('date', inplace=True)
df.loc[df.index >= "2020-02-20", df.columns != 'id'] = df.loc[df.index >= "2020-02-20", df.columns != 'id'] - df2.loc[df2.index >= "2020-02-20"]
  •  Tags:  
  • Related