I have two data frames given below. I want to subtract the rows of df1 by rows of df2 based on specific date values. Here, in the example, I have given the output based on date 2020-02-20. The rows after a specific date (i.e. 2020-02-20) in df1 are subtracted by rows after the specific date (i.e. 2020-02-20) from df2.
df1:
| id || date | | 00:00:00 | | 00:15:00 | | 00:30:00 |.......| 00:23:45 |
| ---- || -------- | | -------- | | -------- | | -------- |.......| -------- |
| ID1 || 2020-01-01 | | 0.15 | | 0.19 | | 0.18 |.......| 0.12 |
| ID1 || 2020-01-02 | | 0.12 | | 0.12 | | 0.13 |.......| 0.09 |
| ID1 || 2020-01-03 | | 0.11 | | 0.10 | | 0.11 |.......| 0.10 |
| . || . | | . | | . | | . |.......| . |
| . || . | | . | | . | | . |.......| . |
| ID1 || 2020-02-20 | | 0.12 | | 0.10 | | 0.15 |.......| 0.13 |
| . || . | | . | | . | | . |.......| . |
| . || . | | . | | . | | . |.......| . |
| ID1 || 2020-12-31 | | 0.14 | | 0.16 | | 0.18 |.......| 0.15 |
df2:
| date | | 00:00:00 | | 00:15:00 | | 00:30:00 |.......| 00:23:45 |
| -------- | | -------- | | -------- | | -------- |.......| -------- |
| 2020-01-01 | | 0 | | 0 | | 0.08 |.......| 0.06 |
| 2020-01-02 | | 0 | | 0 | | 0.09 |.......| 0.02 |
| 2020-01-03 | | 0 | | 0 | | 0.05 |.......| 0.08 |
| . | | . | | . | | . |.......| . |
| . | | . | | . | | . |.......| . |
| 2020-02-20 | | 0 | | 0 | | 0.10 |.......| 0.06 |
| . | | . | | . | | . |.......| . |
| . | | . | | . | | . |.......| . |
| 2020-12-31 | | 0 | | 0.04 | | 0.08 |.......| 0.07 |
Output dataframe (df_out) should be:
| id || date | | 00:00:00 | | 00:15:00 | | 00:30:00 |.......| 00:23:45 |
| ---- || -------- | | -------- | | -------- | | -------- |.......| -------- |
| ID1 || 2020-01-01 | | 0.15 | | 0.19 | | 0.18 |.......| 0.12 |
| ID1 || 2020-01-02 | | 0.12 | | 0.12 | | 0.13 |.......| 0.09 |
| ID1 || 2020-01-03 | | 0.11 | | 0.10 | | 0.11 |.......| 0.10 |
| . || . | | . | | . | | . |.......| . |
| . || . | | . | | . | | . |.......| . |
| ID1 || 2020-02-20 | | 0.12 | | 0.10 | | 0.05 |.......| 0.07 |
| . || . | | . | | . | | . |.......| . |
| . || . | | . | | . | | . |.......| . |
| ID1 || 2020-12-31 | | 0.14 | | 0.12 | | 0.10 |.......| 0.08 |
CodePudding user response:
If the indexes of dataframes are equal, you can simply subtract them:
output = df - df2
to set an index, use set_index:
df.set_index('date', inplace=True)
to only subtract a part of one dataframe from another, slice the dataframes:
df.loc[df.index > '2020-02-20']
Total solution would be something like this:
df.set_index('date', inplace=True)
df2.set_index('date', inplace=True)
df.loc[df.index >= "2020-02-20", df.columns != 'id'] = df.loc[df.index >= "2020-02-20", df.columns != 'id'] - df2.loc[df2.index >= "2020-02-20"]
