I have a dataframe where the index is in UTC. I want to filter the rows in Us/Eastern. The data looks like this:
open high low close volume \
timestamp
2022-01-12 14:00:00-05:00 69.485 69.56 69.445 69.540 800424
2022-01-12 14:30:00-05:00 69.530 69.60 69.480 69.550 675122
2022-01-12 15:00:00-05:00 69.555 69.57 69.455 69.505 747475
2022-01-12 15:30:00-05:00 69.500 69.67 69.410 69.560 2655043
2022-01-12 16:00:00-05:00 69.540 69.54 69.540 69.540 718856
I only want the rows for each day between some time, like 9am to 4pm EST.
This code does what I want it to do, but seems like there must be an easier more direct way.
mdf.reset_index(inplace=True)
mdf["timestamp"] = pd.to_datetime(mdf['timestamp'])
mdf['timestamp'] = mdf['timestamp'].dt.tz_convert('US/Eastern')
mdf.index = pd.DatetimeIndex(mdf.timestamp)
tdf = mdf.between_time('09:00', '16:00')
bdf = tdf.copy(deep=True)
bdf.drop(['timestamp'],axis=1, inplace=True)
When I tried between_time on the index, I recall it complained it wasn't a datetime, or I couldn't convert to eastern as an index. Then when I made the new version, it said it was a copy and I did the deep copy to get rid of that error. There must be a better way.
CodePudding user response:
You can try with indexer_between_time
idx = pd.to_datetime(df.index).tz_convert('US/Eastern').indexer_between_time('09:00', '16:00')
out = df.iloc[idx]
CodePudding user response:
You can avoid a few lines. Since timestamp is already your index, we can convert it to a DatetimeIndex directly.
mdf.index = pd.to_datetime(mdf.index)
new = mdf.between_time('09:00', '16:00').reset_index().drop('id', axis=1)
