Home > Software design >  Better way to filter dates in pandas
Better way to filter dates in pandas

Time:01-21

I have a dataframe where the index is in UTC. I want to filter the rows in Us/Eastern. The data looks like this:

                            open   high     low   close   volume  \
timestamp                                                           
2022-01-12 14:00:00-05:00  69.485  69.56  69.445  69.540   800424   
2022-01-12 14:30:00-05:00  69.530  69.60  69.480  69.550   675122   
2022-01-12 15:00:00-05:00  69.555  69.57  69.455  69.505   747475   
2022-01-12 15:30:00-05:00  69.500  69.67  69.410  69.560  2655043   
2022-01-12 16:00:00-05:00  69.540  69.54  69.540  69.540   718856

I only want the rows for each day between some time, like 9am to 4pm EST.

This code does what I want it to do, but seems like there must be an easier more direct way.

    mdf.reset_index(inplace=True)
    mdf["timestamp"] = pd.to_datetime(mdf['timestamp'])
    mdf['timestamp'] = mdf['timestamp'].dt.tz_convert('US/Eastern')
    mdf.index = pd.DatetimeIndex(mdf.timestamp)
    tdf = mdf.between_time('09:00', '16:00')
    bdf = tdf.copy(deep=True)
    bdf.drop(['timestamp'],axis=1, inplace=True)

When I tried between_time on the index, I recall it complained it wasn't a datetime, or I couldn't convert to eastern as an index. Then when I made the new version, it said it was a copy and I did the deep copy to get rid of that error. There must be a better way.

CodePudding user response:

You can try with indexer_between_time

idx = pd.to_datetime(df.index).tz_convert('US/Eastern').indexer_between_time('09:00', '16:00')
out = df.iloc[idx]

CodePudding user response:

You can avoid a few lines. Since timestamp is already your index, we can convert it to a DatetimeIndex directly.

mdf.index = pd.to_datetime(mdf.index)
new = mdf.between_time('09:00', '16:00').reset_index().drop('id', axis=1)
  •  Tags:  
  • Related