I'm trying to generate a list of start dates which I'll use to scrape Google Trends. I need the start dates 3 hours apart, and then I'll generate each end date 4 hours after its start date, so every end date overlaps the next start date by 1 hour.
from datetime import datetime, timedelta, date
import pandas as pd
import time
start='2018-06-05T01'
end='2020-11-01T23'
start_date = datetime.strptime(start, '%Y-%m-%dT%H')
end_date = datetime.strptime(end, '%Y-%m-%dT%H')
delta = timedelta(hours=3)
while True:
    date_list = []
    date_list.append(start_date + delta)
    if start_date >= end:
        break
This does not work, and I'm not sure how to keep looping until the end date is reached.
CodePudding user response:
Since you're using pandas anyway, try with date_range:
start_date = pd.to_datetime(start, format='%Y-%m-%dT%H')
end_date = pd.to_datetime(end, format='%Y-%m-%dT%H')
date_list = pd.date_range(start_date, end_date, freq="3H")
>>> date_list
DatetimeIndex(['2018-06-05 01:00:00', '2018-06-05 04:00:00',
'2018-06-05 07:00:00', '2018-06-05 10:00:00',
'2018-06-05 13:00:00', '2018-06-05 16:00:00',
'2018-06-05 19:00:00', '2018-06-05 22:00:00',
'2018-06-06 01:00:00', '2018-06-06 04:00:00',
...
'2020-10-31 19:00:00', '2020-10-31 22:00:00',
'2020-11-01 01:00:00', '2020-11-01 04:00:00',
'2020-11-01 07:00:00', '2020-11-01 10:00:00',
'2020-11-01 13:00:00', '2020-11-01 16:00:00',
'2020-11-01 19:00:00', '2020-11-01 22:00:00'],
dtype='datetime64[ns]', length=7048, freq='3H')
If you don't want this to be a DatetimeIndex, you can use:
date_list = pd.date_range(start_date, end_date, freq="3H").tolist()
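The question also asks for end dates 4 hours after each start. A minimal sketch of one way to build both columns, assuming a DataFrame is an acceptable container; the names `starts` and `windows` are illustrative and not from the original post, and the frequency is passed as a `pd.Timedelta` rather than a string alias:

```python
import pandas as pd

start = '2018-06-05T01'
end = '2020-11-01T23'

# Start dates every 3 hours, same range as in the answer above.
starts = pd.date_range(
    pd.to_datetime(start, format='%Y-%m-%dT%H'),
    pd.to_datetime(end, format='%Y-%m-%dT%H'),
    freq=pd.Timedelta(hours=3),
)

# Each window is 4 hours long, so it overlaps the next start by 1 hour.
windows = pd.DataFrame({'start': starts, 'end': starts + pd.Timedelta(hours=4)})
print(windows.head(3))
```

Note that the last window's end falls after the original end value, since the final start date is less than 4 hours before it; whether to trim that window depends on what the scraper expects.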
CodePudding user response:
Your code assigns an empty list to date_list on every iteration, and start_date never changes inside the loop. Also, the end variable is a string, not a datetime like end_date.
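To illustrate the last point, here is a small sketch of what happens when a datetime is compared with a string; the `comparison_failed` flag is only there for demonstration and is not part of the original code:

```python
from datetime import datetime

end = '2020-11-01T23'                             # still a string
end_date = datetime.strptime(end, '%Y-%m-%dT%H')  # parsed datetime
start_date = datetime(2018, 6, 5, 1)

try:
    start_date >= end  # datetime vs. str is not a supported comparison
    comparison_failed = False
except TypeError as exc:
    comparison_failed = True
    print(f"TypeError: {exc}")

# datetime vs. datetime compares as expected.
print(start_date >= end_date)  # False
```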
CodePudding user response:
As user5401398 pointed out, you should:
- Move date_list outside the loop
- Update start_date in the loop
- Compare with end_date instead of the end variable, which is a string.
A modified version is below.
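from datetime import datetime, timedelta
start='2018-06-05T01'
end='2020-11-01T23'
start_date = datetime.strptime(start, '%Y-%m-%dT%H')
end_date = datetime.strptime(end, '%Y-%m-%dT%H')
delta = timedelta(hours=3)
# Create the list once, outside the loop, seeded with the first start date
date_list = [start_date]
while True:
    # Advance by 3 hours on every iteration
    start_date += delta
    # Stop before appending anything past end_date (a datetime, not the string)
    if start_date > end_date:
        break
    date_list.append(start_date)
print(date_list)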
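To also get the 4-hour end dates described in the question, one option is to build (start, end) pairs in the same loop. This is only a sketch under that assumption: it moves the stop check into the while condition, and the names `step`, `window`, `current`, and `date_pairs` are illustrative rather than taken from the original code.

```python
from datetime import datetime, timedelta

start_date = datetime.strptime('2018-06-05T01', '%Y-%m-%dT%H')
end_date = datetime.strptime('2020-11-01T23', '%Y-%m-%dT%H')
step = timedelta(hours=3)    # spacing between consecutive start dates
window = timedelta(hours=4)  # length of each window (1 hour of overlap)

date_pairs = []
current = start_date
# Keep going while the next start date is still within the range.
while current <= end_date:
    date_pairs.append((current, current + window))
    current += step

print(len(date_pairs))
print(date_pairs[0])
print(date_pairs[-1])
```

Using a separate `current` variable leaves `start_date` unchanged, which helps if the original bounds are needed again later.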
