The question "Get year, month or day from numpy datetime64" shows how to get the year, month and day from a numpy datetime64.
One of the answers uses:
import numpy as np

dates = np.arange(np.datetime64('2000-01-01'), np.datetime64('2010-01-01'))
years = dates.astype('datetime64[Y]').astype(int) + 1970
months = dates.astype('datetime64[M]').astype(int) % 12 + 1
days = dates - dates.astype('datetime64[M]') + 1
Also notice that:
To get integers instead of timedelta64[D] in the example for days above, use: (dates - dates.astype('datetime64[M]')).astype(int) + 1
How could the hours, minutes and seconds be extracted?
As in the comment about returning integers, I would like to get integers for these too.
Edit:
Jérôme's answer is useful, but I am still struggling to understand how to reach the safe starting point of having datetime64[s] as input data.
In my actual situation, this is what I have once I read the CSV with Pandas:
print(df['date'])
print(type(df['date']))
print(df['date'].dtype)
0 2018-12-31 23:59:00
1 2018-12-31 23:58:00
2 2018-12-31 23:57:00
3 2018-12-31 23:56:00
4 2018-12-31 23:55:00
...
525594 2018-01-01 00:05:00
525595 2018-01-01 00:04:00
525596 2018-01-01 00:03:00
525597 2018-01-01 00:02:00
525598 2018-01-01 00:01:00
Name: date, Length: 525599, dtype: object
<class 'pandas.core.series.Series'>
object
So how can I convert df['date'] into a dates variable of dtype datetime64[s] and then apply the solution provided?
CodePudding user response:
In your example, the array has dtype datetime64[D], so the hours, minutes and seconds are not stored in the items. An array of dtype datetime64[s] does store them.
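To make the difference between the two units concrete, here is a small sketch. The variable names (days_only, stamp, truncated) are only illustrative and are not part of the question:

```python
import numpy as np

# np.arange between two calendar dates yields day resolution
days_only = np.arange(np.datetime64('2000-01-01'), np.datetime64('2000-01-04'))
print(days_only.dtype)

# A value parsed from a string with a time part is stored at second resolution
stamp = np.datetime64('2009-08-29T23:44:31')
print(stamp.dtype)

# Casting down to days drops the time of day
truncated = stamp.astype('datetime64[D]')
print(truncated)
```

So the time of day can only be recovered if the array was created with a unit of seconds (or finer) in the first place.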
Here is how to extract the information from a np.datetime64[s]-typed array:
# dates = array(['2009-08-29T23:44:31',
# '2017-12-17T05:47:37'],
# dtype='datetime64[s]')
dates = np.array([
np.datetime64(1251589471, 's'),
np.datetime64(1513489657, 's')
])
Y, M, D, h, m, s = [dates.astype('datetime64[%s]' % kind) for kind in 'YMDhms']
years = Y.astype(int) + 1970
months = M.astype(int) % 12 + 1
days = (D - M).astype(int) + 1
hours = (h - D).astype(int)
minutes = (m - h).astype(int)
seconds = (s - m).astype(int)
# [array([2009, 2017]),
# array([ 8, 12], dtype=int32),
# array([29, 17]),
# array([23, 5]),
# array([44, 47]),
# array([31, 37])]
print([years, months, days, hours, minutes, seconds])
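Regarding the edit: one possible way to get from the object-typed column to a datetime64[s] array is to parse it with pd.to_datetime and then cast the underlying NumPy array. This is only a sketch. The two-row DataFrame below is a made-up stand-in for your CSV, and I am assuming every entry in the column is a string in the format shown in your output:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV column (strings, dtype object)
df = pd.DataFrame({'date': ['2018-12-31 23:59:00', '2018-01-01 00:01:00']})

# Parse the strings, take the NumPy array, and cast it to second resolution
dates = pd.to_datetime(df['date']).to_numpy().astype('datetime64[s]')

# Same extraction as above, applied to the converted array
Y, M, D, h, m, s = [dates.astype('datetime64[%s]' % kind) for kind in 'YMDhms']
years = Y.astype(int) + 1970
months = M.astype(int) % 12 + 1
days = (D - M).astype(int) + 1
hours = (h - D).astype(int)
minutes = (m - h).astype(int)
seconds = (s - m).astype(int)

print([years, months, days, hours, minutes, seconds])
```

If staying in pandas is acceptable, the .dt accessor on the parsed Series (for example pd.to_datetime(df['date']).dt.hour) gives the same fields as integers without going through NumPy.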
