I have a dataset that is formatted in the below manner, and I'm trying to reformat it in a way that instead of DateTime column, id get the mean order count for each day (Monday,Tuesday, etc.)
| TIMESTAMP | TEMPERATURE | WINDSPEED |
|---|---|---|
| 2020-08-01 | 13.2 | 4.9 |
| 2020-08-01 | 15 | 5 |
| 2020-08-02 | 16 | 2.4 |
| 2020-08-02 | 14.2 | 6.3 |
| 2020-09-10 | 17.5 | 2 |
| 2020-09-10 | 9 | 8.3 |
Heres my code so far, everything seems to work fine and I can print the mean order count by day, by itself, but when trying to import it into the data set the ordercount is Nan
df = pd.read_csv('orders_autumn_2020.csv')
df['TIMESTAMP']= pd.to_datetime(df['TIMESTAMP'])
df_mod = df.groupby(df['TIMESTAMP'].dt.weekday).mean()
datecount = df.resample('D', on='TIMESTAMP').count()
ORDCOUNT = (datecount['WINDSPEED'])
df_mod["ORDCOUNT"] = ORDCOUNT
df_mod = df_mod[["TEMPERATURE","WIND_SPEED","ORDCOUNT"]]
print(df_mod)
| TIMESTAMP | TEMPERATURE | WINDSPEED | ORDCOUNT |
|---|---|---|---|
| 0 | 17.055038 | 4.027295 | NaN |
| 1 | 15.961699 | 2.951472 | NaN |
| 2 | 16.305026 | 3.600513 | NaN |
| 3 | 16.142084 | 4.051359 | NaN |
| 4 | 16.864189 | 3.131984 | NaN |
| 5 | 17.364454 | 4.230898 | NaN |
| 6 | 18.321807 | 4.310171 | NaN |
CodePudding user response:
In your solution aggregate by 2 different values - by weekday and by D for days, so indices are different so if assign column get NaNs.
Possible solution if need count per weekday with omit NaNs if exist in WINDSPEED column with GroupBy.count:
df1 = (df.groupby(df['TIMESTAMP'].dt.weekday)
.agg(TEMPERATURE = ('TEMPERATURE','mean'),
WINDSPEED = ('WINDSPEED','mean'),
ORDCOUNT = ('WINDSPEED','count')))
Or if need count per days use Grouper:
df2 = (df.groupby(pd.Grouper(freq='d', key='TIMESTAMP'))
.agg(TEMPERATURE = ('TEMPERATURE','mean'),
WINDSPEED = ('WINDSPEED','mean'),
ORDCOUNT = ('WINDSPEED','count')))
Your solution:
df_mod = df.groupby(df['TIMESTAMP'].dt.weekday).mean()
datecount = df.resample('D', on='TIMESTAMP')['WINDSPEED'].count()
df_mod["ORDCOUNT"] = datecount.groupby(datecount.index.weekday).sum()
