Calculating resolution rate per sub-category in Python

I have a data frame like the one below. I have shortened it to get the idea across more easily.

Category	Issue	WK 1	WK 2
Pending B	C	1	2
Pending B	E	3	4
Pending B	R	4	5
Pending C	C	1	2
Pending C	E	3	4
Pending C	R	4	5
Resolved	C	1	2
Resolved	E	3	4
Resolved	R	4	5
Total		24	33

I want to use the formulas below.

Formula for the WK 1 column:

C: WK 1 / (Total WK 1 - (sum of Pending C WK 1))
E: WK 1 / (Total WK 1 - (sum of Pending C WK 1))
R: WK 1 / (Total WK 1 - (sum of Pending C WK 1))

Formula for the WK 2 column:

C: WK 2 / (Total WK 2 - (sum of Pending C WK 2))
E: WK 2 / (Total WK 2 - (sum of Pending C WK 2))
R: WK 2 / (Total WK 2 - (sum of Pending C WK 2))
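
As a quick sanity check of the WK 1 formula, the arithmetic can be sketched in plain Python. The numbers are copied from the sample table above; the variable names are only illustrative.

```python
# Values copied from the sample table (WK 1 column only)
wk1 = {
    "Pending B": {"C": 1, "E": 3, "R": 4},
    "Pending C": {"C": 1, "E": 3, "R": 4},
    "Resolved": {"C": 1, "E": 3, "R": 4},
}

total_wk1 = sum(sum(issues.values()) for issues in wk1.values())  # 24
pending_c_wk1 = sum(wk1["Pending C"].values())                    # 8
denominator = total_wk1 - pending_c_wk1                           # 16

# Resolution rate for each Resolved issue
rates_wk1 = {issue: value / denominator for issue, value in wk1["Resolved"].items()}
print(rates_wk1)  # {'C': 0.0625, 'E': 0.1875, 'R': 0.25}
```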

In the end, I want a data frame like the one below.

Category	Issue	WK 1	WK 2	WK 1(R)	WK 2(R)
Resolved	C	1	2	0.0625	0.090909
Resolved	E	3	4	0.1875	0.181818
Resolved	R	4	5	0.25	0.227273

CodePudding user response:

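You can write a function that computes the WK 1(R) and WK 2(R) columns, then use .loc to select the rows where "Category" is "Resolved". This assumes df holds only the nine data rows and not the Total row; otherwise df["WK 1"].sum() would count the total twice.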

def calculate_wk_resolution(df):
    # Rows belonging to the "Pending C" category
    pending_c = df["Category"] == "Pending C"
    # Rate = value / (column total - "Pending C" subtotal); the columns are added to df in place
    df["WK 1(R)"] = df["WK 1"] / (df["WK 1"].sum() - df.loc[pending_c, "WK 1"].sum())
    df["WK 2(R)"] = df["WK 2"] / (df["WK 2"].sum() - df.loc[pending_c, "WK 2"].sum())

calculate_wk_resolution(df)

# Keep only the "Resolved" rows and the columns of interest
out = df.loc[df["Category"] == "Resolved", ["Category", "Issue", "WK 1", "WK 2", "WK 1(R)", "WK 2(R)"]]
print(out)

Output:

   Category  Issue  WK 1  WK 2  WK 1(R)  WK 2(R)
6  Resolved     C     1     2   0.0625  0.090909
7  Resolved     E     3     4   0.1875  0.181818
8  Resolved     R     4     5   0.2500  0.227273
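
If there are more week columns than the two shown, the same calculation can be applied to all of them at once. The following is a minimal sketch rather than the only way to do it: it rebuilds the sample data from the question (without the Total row), and the helper name add_resolution_rates and its parameters are hypothetical.

```python
import pandas as pd

# Sample data from the question, without the "Total" row
df = pd.DataFrame({
    "Category": ["Pending B"] * 3 + ["Pending C"] * 3 + ["Resolved"] * 3,
    "Issue": ["C", "E", "R"] * 3,
    "WK 1": [1, 3, 4] * 3,
    "WK 2": [2, 4, 5] * 3,
})


def add_resolution_rates(df, week_cols, excluded="Pending C"):
    """Return a copy of df with one '<week>(R)' column per week column.

    Each rate is value / (column total - subtotal of the excluded category).
    """
    out = df.copy()
    # Column totals minus the subtotal of the excluded category, per week
    denom = out[week_cols].sum() - out.loc[out["Category"] == excluded, week_cols].sum()
    # Dividing a DataFrame by a Series aligns on the column labels
    rates = out[week_cols] / denom
    rates.columns = [f"{col}(R)" for col in week_cols]
    return pd.concat([out, rates], axis=1)


week_cols = ["WK 1", "WK 2"]
result = add_resolution_rates(df, week_cols)
resolved = result.loc[result["Category"] == "Resolved"]
print(resolved)
```

Because the helper works on a copy, the original df is left unchanged, and adding a "WK 3" column only requires extending week_cols.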