I would like to transform the dictionary below
x = {
'John': ['1.12.2021','2.12.2021','3.02.2022','4.2.2022','5.07.2022','6.07.2022','7.12.2022','8.12.2022'],
'Andrew': ['1.12.2021','2.03.2022','3.03.2022','4.05.2022','5.05.2022','6.09.2022','7.09.2022','8.11.2022','9.12.2022','10.12.2022']}
into a DataFrame like this, with columns that will show counts per Month
Name 12.2021 02.2022 03.2022 05.2022 07.2022 09.2022 11.2022 \
John 2 2 0 0 2 0 0
Andrew 1 0 2 2 0 2 1
12.2022
2
2
I started with this this, transforming the values into datetimes :
x = pd.DataFrame.from_dict(x, 'index').reset_index().fillna(value='0')
x.iloc[:,1:] = pd.to_datetime(x.iloc[:,1:], format='%d.%m.%Y')
print(x)
But getting this error: AttributeError: 'int' object has no attribute 'lower'
CodePudding user response:
You could use a dictionary comprehension and pandas.concat unstack:
day first
(pd.concat({k: pd.to_datetime(pd.Series(v), dayfirst=True)
.dt.strftime('%Y-%m')
.value_counts()
for k,v in x.items()})
.unstack(fill_value=0)
)
output:
2021-12 2022-02 2022-03 2022-05 2022-07 2022-09 2022-11 2022-12
John 2 2 0 0 2 0 0 2
Andrew 1 0 2 2 0 2 1 2
previous answer: month first (before data update)
(pd.concat({k: pd.Series(v, dtype='datetime64[ns]')
.dt.strftime('%Y-%m')
.value_counts()
for k,v in x.items()})
.unstack(fill_value=0)
)
output:
2021-01 2021-02 2021-03 2021-04 2021-05 2021-06 2021-07 2021-08 2021-09 2021-10
John 1 1 1 1 1 1 1 1 0 0
Andrew 1 1 1 1 1 1 1 1 1 1
