Home > Back-end >  How to establish successful rate by interval?
How to establish successful rate by interval?

Time:01-07

I have a data like this: enter image description here

Is it possible to establish successful rate by interval? For example: We are adding successful project states (count_n where states=successful) and divide it by the whole number of projects (sum of count_n where interval is 1-10) between 1-10 interval. The same we do with interval 1-20. I would like to get data like this: successful rate | interval X 1-10 Y 10-20

I'm coding in SAS but I can use SQL Server in it. Thanks.

CodePudding user response:

In PROC SQL, you can do it this way:

proc sql;
    create table want as
        select interval
             , sum( (upcase(state) = 'SUCCESSFUL')*count_n)/sum(count_n) format=percent8.1 as success_rate
        from have
        group by interval
    ;
quit;

The code (upcase(state) = 'SUCCESSFUL') produces a 1/0 value such that only rows where the state is successful are summed. Multiplying this by count_n will give 0 for non-successful states and count_n for successful states. This is a shortcut that prevents you from having to do multiple joins to get the required numerator.

Example code:

data have;
    length state $20.;
    input state$ count_n interval$;
    datalines;
successful 70 1-10
successful 10 1-10
fail 20 1-10
successful 70 11-20
successful 5 11-20
fail 25 11-20
;
run;

Output:

interval    success_rate
1-10        80.0%
11-20       75.0%

CodePudding user response:

I prefer using pre-defined SAS procedures whenever possible - they're typically more efficient.

For something like this you can use PROC FREQ. You need to specify the WEIGHT with the count to indicate that each observation is counted multiple times and then you can get a variety of percentages - COL_PCT is per category in this case.

Remove the WHERE/KEEP to see the full output and the different statistics it generates for you. Neither of these solutions accounts for missing values. If you need to, add the MISSING option within PROC FREQ.

proc freq data=have noprint;
table state*interval / out=want (keep = state interval count pct_col where=(state='successful')) missing outpct;
weight count_n;
run;

proc print data=want;
run;
  •  Tags:  
  • Related