I have a dataset with multiple columns out of which 2 columns I want to use to filter data as follows:
- In the Reason column keep all row values that are A3
- In the Reason column keep all values (even null) that has a date(obj) in Goods_Issue_Date_(GID)
*feel free to assign a value for the nulls in reason that have date in GID
and drop the other values like c,b..
have used this code which works for A3 values fine:
Df = Data[Data["Reason"].isna(['A3'])] ....?
df=pd.DataFrame({'Reason':{0: 'b',1: 'c',2: 'a3',3: ' ',4: ' ',5: 'a3',6: 'a3',7: ' ',8: 'b',9: ' ',}, 'Goods_Issue_Date_(GID)':{0: ' -1',1: '2 ',2: ' ',3: ' ',4: '2021-11-03T00:00:00',5: '2021-11-03T00:00:00',6: '',7: '',8: '0.5',9: '2021-11-03T00:00:00'}})
| Reason | Goods_Issue_Date_(GID) |
|---|---|
| b | -1 |
| c | 2 |
| a3 | |
| 2021-11-03T00:00:00 | |
| 2021-11-03T00:00:00 | |
| a3 | |
| a3 | |
| 0 | |
| b | 0.5 |
| 2021-11-03T00:00:00 |
CodePudding user response:
This assumes your date formats always end in 00:00
df[(df['Reason']=='a3') | (df['Goods_Issue_Date_(GID)'].str[-5:]=='00:00')]
Reason Goods_Issue_Date_(GID)
2 a3
4 2021-11-03T00:00:00
5 a3 2021-11-03T00:00:00
6 a3
9 2021-11-03T00:00:00
CodePudding user response:
df[(df['Reason']=='a3') | pd.to_datetime(df['Goods_Issue_Date_(GID)'], errors='coerce').notna()]
