I have a csv file like this:
2021-01-05 10:57:12.762000, REDDE EHZ AM 00, trigger
2021-01-05 10:58:26.622000, REDDE EHZ AM 00, trigger
2021-01-05 11:02:16.772000, REDDE EHZ AM 00, trigger
2021-01-05 11:02:34.042000, REDDE EHZ AM 00, trigger
2021-01-05 17:12:07.221999, REDDE EHZ AM 00, trigger
2021-01-06 01:42:45.501999, REDDE EHZ AM 00, trigger
2021-01-06 01:44:24.481999, REDDE EHZ AM 00, trigger
2021-01-06 01:44:58.051999, REDDE EHZ AM 00, trigger
2021-01-06 01:45:14.871999, REDDE EHZ AM 00, trigger
2021-01-06 01:47:10.901999, REDDE EHZ AM 00, trigger
2021-01-06 07:57:33.221999, REDDE EHZ AM 00, trigger
2021-01-06 07:57:48.821999, REDDE EHZ AM 00, trigger
2021-01-06 07:58:51.031999, REDDE EHZ AM 00, trigger
2021-01-06 07:59:27.001999, REDDE EHZ AM 00, trigger
2021-01-06 08:00:56.871999, REDDE EHZ AM 00, trigger
2021-01-06 11:28:17.191999, REDDE EHZ AM 00, trigger
2021-01-06 11:28:46.201999, REDDE EHZ AM 00, trigger
2021-01-06 11:29:19.111999, REDDE EHZ AM 00, trigger
2021-01-06 11:29:41.891999, REDDE EHZ AM 00, trigger
2021-01-06 11:30:51.901999, REDDE EHZ AM 00, trigger
2021-01-06 11:31:21.921999, REDDE EHZ AM 00, trigger
2021-01-06 11:32:23.001999, REDDE EHZ AM 00, trigger
2021-01-06 11:32:58.271999, REDDE EHZ AM 00, trigger
2021-01-07 11:33:46.891999, REDDE EHZ AM 00, trigger
2021-01-07 12:38:50.021999, REDDE EHZ AM 00, trigger
2021-01-07 12:39:53.881999, REDDE EHZ AM 00, trigger
2021-01-08 12:42:07.371999, REDDE EHZ AM 00, trigger
2021-01-08 12:42:46.441999, REDDE EHZ AM 00, trigger
2021-01-09 12:44:14.291999, REDDE EHZ AM 00, trigger
I added the header with:
df = pd.read_csv(r'D:\Inves\SM\CC_Cbba\REDPy\OSCREDDE_3_\redde_3_trigs.dat',
sep=',', header=None, usecols=[0, 1, 2])
headers = ["TrigDT", "Sta", "Type"]
The output is:
TrigDT Sta Type
0 2021-01-05 10:57:12.762000 REDDE EHZ AM 00 trigger
1 2021-01-05 10:58:26.622000 REDDE EHZ AM 00 trigger
2 2021-01-05 11:02:16.772000 REDDE EHZ AM 00 trigger
3 2021-01-05 11:02:34.042000 REDDE EHZ AM 00 trigger
4 2021-01-05 17:12:07.221999 REDDE EHZ AM 00 trigger
...
I created a Date colum in roder to try to group the information by days:
df['TrigDT'] = pd.to_datetime(df['TrigDT'])
df['Date'] = df['TrigDT'].dt.date
I tried to have a cumulative sum with Index because I do not have a colum with counter of events, then I tried to group by days but I fail:
df = df.groupby('Date').index.sum()
df = df.groupby(df.index.day).cumsum().reset_index()
The idea is to create a cumulative plot with the DataFrame information (X axis with dates a Y axis with cumulative information), I tried to have a plot
Thank you
CodePudding user response:
Not sure I understood what a, b, c would be exactly in your chart, but here is how you can accumulate the number of triggers daily:
# Load the data and set 'TrigDT' as the index
df = pd.read_csv('path_to_file.csv', header=None, usecols=(0, 1, 2), names=('TrigDT', 'Sta', 'Type'))
df.TrigDT = pd.to_datetime(df.TrigDT)
df.set_index('TrigDT', inplace=True)
# Resample to daily and count
daily_acc = df.resample('d').count().cumsum()
daily_acc.plot()
