I have column of dates ( Dtype = datetime64[ns]) and a column of integers (Dtype = Int16) in the same dataframe (df). The integers column all have values between 20 and 100.
i want to create a new date column that is the Date column minus the integer columns i.e. df:
RepDate | Age | RevisedDate (To be calc)
1/1/19 | 20 | 1/1/99
4/2/22 | 50 | 4/2/72
so far i've tried:
df['RevisedDate'] = df['RepDate'] - pd.DateOffset(years= df['Age'])
but i get a TypeError: cannot convert the series to <class 'int'>
i've also tried deconstrucint and putting back torgether:
df['YofRevDate'] = pd.DatetimeIndex(df['RepDate']).year - df['Age']
df['MofRevDate'] = pd.DatetimeIndex(df['RepDate']).month
df['DofRevDate'] = pd.DatetimeIndex(df['RepDate']).day
df['RevisedDate'] = date(df['YofRevDate'], df['MofRevDate'], df['DofRevDate'])
but i get the same error on the last line.
your help will be much apricated
CodePudding user response:
To achieve that you have 2 ways:
- convert Age to something like timedelta in years (not supported in recent pandas versions), and add timedelta to date column
- parse date column, do age calculations, create new date column
I'm not sure how fast is my sollution (second option):
new_year = (df.RepDate.dt.year df.Age).astype(str)
month_day = df.RepDate.dt.month.astype(str) "-" df.RepDate.dt.day.astype(str)
df['NewRepDate'] = pd.to_datetime(new_year "-" month_day,format="%Y-%m-%d")
df
