Home > Net >  Pandas grouping by week
Pandas grouping by week

Time:01-24

I have a data frame in pandas like this:

Name  Date
A     9/1/21
B     10/20/21
C     9/8/21
D     9/20/21
K     9/29/21
K     9/15/21
M     10/1/21
C     9/12/21
D     9/9/21
C     9/9/21
R     9/20/21

I need to get the count of items by week.

weeks = [9/6/21, 9/13, 9/20/21, 9/27/21, 10/4/21]  

Example: From 9/6 to 9/13, the output should be:

Name  Weekly count
A     0
B     0
C     3
D     1
M     0
K     0
R     0

Similarly, I need to find the count on these intervals: 9/13 to 9/20, 9/20 to 9/27, and 9/27 to 10/4. Thank you!

CodePudding user response:

Here is my very rough solution to how to handle a problem like this. I only handle the first range of weeks to demonstrate how it works, but you can easily loop through weeks (replace weeks[0] and weeks[1] with weeks[i] and weeks[i 1] in the loop) and get the rest of the results:

import pandas as pd
from datetime import datetime

d = {"Name": ["A", "B", "C", "D", "K", "K", "M", "C", "D", "C", "R"],
"Date": ["9/1/21", "10/20/21", "9/8/21", " 9/20/21","9/29/21","9/15/21","10/1/21","9/12/21","9/9/21","9/9/21","9/20/21"]}

df = pd.DataFrame(data=d)

df_copy = df.copy()


weeks = ["9/6/21", "9/13/21", "9/20/21", "9/27/21", "10/4/21"]

df['Date'] = pd.to_datetime(df['Date'])  

mask = (df['Date'] >= datetime.strptime(weeks[0], "%m/%d/%y")) & (df['Date'] <= datetime.strptime(weeks[1],"%m/%d/%y"))


df = df.loc[mask]

out = df.groupby('Name').count().reset_index()


for row in df_copy.iterrows():
    if row[1]["Name"] not in list(out["Name"]):
        #print(row[1]["Name"])
        to_add = {"Name": row[1]["Name"], "Date" : 0}
        out = out.append(to_add, ignore_index=True)

out = out.sort_values(by=['Name']).rename(columns={"Date":"Weekly Count"})

print(out)

Note: There is probably a much cleaner way to do this other than adding the rows with 0 matching weeks back in at the end, but I do not know how.

Output:

enter image description here

  •  Tags:  
  • Related