I have this two dataframes I need to relate.
The first one, HOLIDAYS, gives me local holiday dates and the stores code in which they're celebrated
| Holiday date | Store code |
|---|---|
| 01/02 | 18005 |
| 01/02 | 18032 |
| ... | ... |
| 31/03 | 18043 |
The second one, BALANCE, shows balance of stores in certain dates with date and number stores as index.
| balance | ||
|---|---|---|
| 01/02 | 18001 | $35,00 |
| 01/02 | 18002 | $38,00 |
| ... | ... | ... |
| 31/03 | 18099 | $20,45 |
What I need to do is to create a column in BALANCE named Holiday with a boolean value showing if a certain row is showing a balance obtained during a holiday or not.
I tried to create the column 'Holiday' setting the initial value as False and then assigning every value of HOLIDAY in index of BALANCE dataframe to True, I'm getting ValueError (possibly because a dataframe cannot be passed as index of other). I tried to convert HOLIDAY to MultiIndex but again it's not working.
BALANCE['Holiday'] = False
H = pd.MultiIndex.from_frame(HOLIDAY)
BALANCE.loc[H, 'Holiday'] = True
I'm pretty sure this should not be difficult but I'm out of ideas now. There is any way I could work with the first dataframe as MultiIndex of the second?
CodePudding user response:
Your example doesn't have any rows which match, but this should work:
HOLIDAYS['is_holiday'] = True
res = pd.merge(BALANCE,
HOLIDAYS,
how='left',
left_index=True,
right_on=['Holiday_date', 'Store_code'])
res['is_holiday'] = res['is_holiday'].fillna(False)
