Given the following DataFrame of pandas in Python:
Displays the switching on and off of 3 bulbs at different times. Using datetime objects.
date code other time
1 2022-02-27 15:30:21 00:00 5 ON NaT
2 2022-02-29 17:05:21 00:00 5 OFF 2 days 01:35:00
3 2022-04-07 17:05:21 00:00 5 OFF NaT
4 2022-04-06 16:10:21 00:00 4 ON NaT
5 2022-04-07 15:30:21 00:00 4 OFF 0 days 23:20:00
6 2022-02-03 22:40:21 00:00 3 ON NaT
7 2022-02-03 23:20:21 00:00 3 OFF 0 days 00:40:00
8 2022-02-04 00:20:21 00:00 3 ON NaT
9 2022-02-04 14:30:21 00:00 3 ON NaT
10 2022-01-31 15:30:21 00:00 3 ON NaT
11 2022-02-04 15:35:21 00:00 3 OFF 4 days 00:05:00
12 2022-02-04 15:40:21 00:00 3 OFF NaT
13 2022-02-04 19:40:21 00:00 3 ON NaT
14 2022-02-06 15:35:21 00:00 3 OFF 1 days 19:55:00
15 2022-02-23 21:10:21 00:00 3 ON NaT
16 2022-02-24 07:10:21 00:00 3 OFF 0 days 10:00:00
I want to add a new column, called nights. This column will include only for rows where the variable time is different from NaT. Information on how many nights the light bulb has been on. The night period is defined as 22:00:00 to 05:00:00.
Example of the resulting DataFrame:
date code other time nights
1 2022-02-27 15:30:21 00:00 5 ON NaT 0
2 2022-02-29 17:05:21 00:00 5 OFF 2 days 01:35:00 2
3 2022-04-07 17:05:21 00:00 5 OFF NaT 0
4 2022-04-06 16:10:21 00:00 4 ON NaT 0
5 2022-04-07 15:30:21 00:00 4 OFF 0 days 23:20:00 1
6 2022-02-03 22:40:21 00:00 3 ON NaT 0
7 2022-02-03 23:20:21 00:00 3 OFF 0 days 00:40:00 0
8 2022-02-04 00:20:21 00:00 3 ON NaT 0
9 2022-02-04 14:30:21 00:00 3 ON NaT 0
10 2022-01-31 15:30:21 00:00 3 ON NaT 0
11 2022-02-04 15:35:21 00:00 3 OFF 4 days 00:05:00 4
12 2022-02-04 15:40:21 00:00 3 OFF NaT 0
13 2022-02-04 19:40:21 00:00 3 ON NaT 0
14 2022-02-06 15:35:21 00:00 3 OFF 1 days 19:55:00 2
15 2022-02-23 21:10:21 00:00 3 ON NaT 0
16 2022-02-24 07:10:21 00:00 3 OFF 0 days 10:00:00 1
The light bulb ON register is just before the non-NaT value of the time variable.
Information added: In case the bulb is switched off and on in the middle of the night, it will not be taken into account for the variable nights. It has to spend the entire interval switched on.
date code other time nights
1 2022-02-27 21:00:00 00:00 1 ON NaT 0
2 2022-02-28 01:00:00 00:00 1 OFF 0 days 04:00:00 0
3 2022-02-28 03:15:00 00:00 1 ON NaT 0
4 2022-02-28 09:30:00 00:00 1 OFF 0 days 06:15:00 0
CodePudding user response:
Not knowing how you want to handle partial nights you could try something like this.... [Updated to handle partial days better] [Update to add imports]
import pandas as pd
import datetime as dt
df['edate'] = df.date df.time
df['first_night_start'] = pd.to_datetime(df.date.dt.date) dt.timedelta(hours=22)
df['first_night_end'] = df['first_night_start'] dt.timedelta(hours=7)
def get_nights(ser):
if pd.isna(ser.time):
return
days = ser.time.days
the_rest = ser.time - dt.timedelta(days)
if ((ser.date the_rest) > ser.first_night_end) and (ser.date < ser.first_night_start):
return days 1
else:
return days
df['nights'] = df.apply(get_nights, axis=1)
