I am reading a .csv data file using pd.read_csv and I get these first 5 rows from my global dataframe (containing thousands of rows):
                  time  id  time_offset
0  2017-12-01 21:00:00   0          -60
1  2017-12-01 21:01:00   0          -59
2  2017-12-01 21:02:00   0          -58
3  2017-12-01 21:03:00   0          -57
4  2017-12-01 21:04:00   0          -56
I'm not very good at manipulating dates in Python and I haven't found how to do this manipulation:
- create in my dataframe a new hour column from the existing time column, containing only the hours:minutes:seconds data, which should be: 21:00:00, 21:01:00, 21:02:00, etc.
- then create another column seconds from the newly created hour, containing the number of seconds elapsed since time 0, which should be: 75600 (calculated as 21x3600), 75601 (calculated as 21x3600 + 1), etc.
Any help in sorting this out would be much appreciated.
CodePudding user response:
You can try:
# convert `time` column to datetime (if necessary):
df["time"] = pd.to_datetime(df["time"])
df["hour"] = df["time"].dt.time
# seconds elapsed since midnight: hours, minutes and seconds added together
df["seconds"] = (
    df["time"].dt.hour * 60 * 60
    + df["time"].dt.minute * 60
    + df["time"].dt.second
)
print(df)
Prints:
                  time  id  time_offset      hour  seconds
0  2017-12-01 21:00:00   0          -60  21:00:00    75600
1  2017-12-01 21:01:00   0          -59  21:01:00    75660
2  2017-12-01 21:02:00   0          -58  21:02:00    75720
3  2017-12-01 21:03:00   0          -57  21:03:00    75780
4  2017-12-01 21:04:00   0          -56  21:04:00    75840
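As an alternative, here is a minimal sketch that gets the same seconds-since-midnight values by subtracting each timestamp's own midnight (via dt.normalize()) and reading the total seconds of the resulting timedelta. The small frame is built by hand from the rows shown in the question and only stands in for the real CSV; the name since_midnight is just for illustration.

```python
import pandas as pd

# Sample rows copied from the question (a stand-in for the real CSV data).
df = pd.DataFrame(
    {
        "time": [
            "2017-12-01 21:00:00",
            "2017-12-01 21:01:00",
            "2017-12-01 21:02:00",
        ],
        "id": [0, 0, 0],
        "time_offset": [-60, -59, -58],
    }
)

df["time"] = pd.to_datetime(df["time"])

# Time of day as datetime.time objects.
df["hour"] = df["time"].dt.time

# Timedelta between each timestamp and midnight of the same day.
since_midnight = df["time"] - df["time"].dt.normalize()
df["seconds"] = since_midnight.dt.total_seconds().astype(int)

print(df[["hour", "seconds"]])
```

This avoids spelling out the hour/minute/second arithmetic by hand, at the cost of an intermediate timedelta series.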
CodePudding user response:
Setting the datetime series as the index is typically useful. pd.to_datetime() converts the column to a usable datetime format.
df.index = pd.to_datetime(df['time'])
df = df.drop('time', axis=1)  # drop() returns a new frame, so assign it back
- You can then use the strftime function (format codes: https://strftime.org/):
df['time'] = df.index.strftime("%H:%M:%S")
- Since df.index[0] is the very first time, you can subtract it from the index and use the .seconds attribute:
df['seconds since'] = (df.index - df.index[0]).seconds
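The index-based approach can be sketched end to end as follows. Note that this counts seconds from the first row rather than from midnight, and that .seconds only holds the within-day component of each timedelta, so it wraps once the data spans more than a day. The sample rows are hypothetical (the third one is deliberately placed on the next day to show the difference), and the 'total seconds since' column is an illustrative addition, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample; the last row falls on the following day.
df = pd.DataFrame(
    {
        "time": [
            "2017-12-01 21:00:00",
            "2017-12-01 21:01:00",
            "2017-12-02 21:02:00",
        ],
        "id": [0, 0, 0],
    }
)

df.index = pd.to_datetime(df["time"])
df = df.drop("time", axis=1)

# Time of day as a string column.
df["hour"] = df.index.strftime("%H:%M:%S")

# TimedeltaIndex relative to the first timestamp.
elapsed = df.index - df.index[0]

# .seconds is only the seconds component (0 to 86399) and ignores whole days.
df["seconds since"] = elapsed.seconds

# total_seconds() includes the days as well.
df["total seconds since"] = elapsed.total_seconds().astype(int)

print(df)
```

If the data never crosses a day boundary, the two columns are identical.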
