I have a data frame like below. I have made the data frame shorter to put the idea across easily
| `Category | Issue | WK 1 | WK 2 |
|---|---|---|---|
| Pending B | C | 1 | 2 |
| Pending B | E | 3 | 4 |
| Pending B | R | 4 | 5 |
| Pending C | C | 1 | 2 |
| Pending C | E | 3 | 4 |
| Pending C | R | 4 | 5 |
| Resolved | C | 1 | 2 |
| Resolved | E | 3 | 4 |
| Resolved | R | 4 | 5 |
| ----------: | --------------: | ----: | :---: |
| Total | 24 | 33 |
` Using the formulars below:
formular for WK 1 column
C-WK 1/(Total WK 1 - (sum of pending C WK 1))
E-WK 1/(Total WK 1 - (sum of pending C WK 1))
R-WK 1/(Total WK 1 - (sum of pending C WK 1))
formular for WK 2 column
C-WK 2/(Total WK 2 - (sum of pending C WK 2))
E-WK 2/(Total WK 2 - (sum of pending C WK 2))
R-WK 2/(Total WK 2 - (sum of pending C WK 2))`
at the end i want to have a data frame like below.
| Category | Issue | WK 1 | WK 2 | WK 1(R) | WK 2(R) |
|---|---|---|---|---|---|
| Resolved | C | 1 | 2 | 0.0625 | 0.090909 |
| Resolved | E | 3 | 4 | 0.1875 | 0.181818 |
| Resolved | R | 4 | 5 | 0.25 | 0.227273 |
| ----------: | --------------: | ----: | :---: | -------: | :------: |
CodePudding user response:
You can construct a function to calculate the columns of WK 1(R) and WK 2(R) and after that, use .loc to select the rows where "Category" is "Resolved".
def calculate_wk_resolution(df):
# Calculate WK 1(R) and WK 2(R)
df["WK 1(R)"] = df["WK 1"] / (df["WK 1"].sum() - df[df["Category"] == "Pending C"]["WK 1"].sum())
df["WK 2(R)"] = df["WK 2"] / (df["WK 2"].sum() - df[df["Category"] == "Pending C"]["WK 2"].sum())
calculate_wk_resolution(df)
out = df.loc[df["Category"] == "Resolved", ["Category", "Issue", "WK 1", "WK 2", "WK 1(R)", "WK 2(R)"]]
print(out)
Output:
Category Issue WK 1 WK 2 WK 1(R) WK 2(R)
6 Resolved C 1 2 0.0625 0.090909
7 Resolved E 3 4 0.1875 0.181818
8 Resolved R 4 5 0.2500 0.227273
