R> ecdf
function (x)
{
    x <- sort(x)
    n <- length(x)
    if (n < 1)
        stop("'x' must have 1 or more non-missing values")
    vals <- unique(x)
    rval <- approxfun(vals, cumsum(tabulate(match(x, vals)))/n,
        method = "constant", yleft = 0, yright = 1, f = 0, ties = "ordered")
    class(rval) <- c("ecdf", "stepfun", class(rval))
    assign("nobs", n, envir = environment(rval))
    attr(rval, "call") <- sys.call()
    rval
}
The above is the source of ecdf(). I see that the return value is given the class stepfun, and I don't understand what that is for. Given that approxfun() does linear interpolation, why is stepfun needed? Could anybody explain the purpose of adding stepfun to the class?
CodePudding user response:
In the call to approxfun() in ecdf(), the method is "constant", which means it isn't doing linear interpolation, it's generating a step function.
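As a rough illustration (the data values below are made up, not taken from the question), the same points can be passed to approxfun() with both settings to see the difference:

```r
# Hypothetical knots and heights, chosen only for illustration.
x <- c(1, 2, 3)
y <- c(0.2, 0.5, 1)

# Same arguments ecdf() uses: piecewise constant, taking the left value (f = 0).
step_fn <- approxfun(x, y, method = "constant", yleft = 0, yright = 1, f = 0)

# The default-style linear interpolation, for comparison.
lin_fn <- approxfun(x, y, method = "linear", yleft = 0, yright = 1)

step_fn(1.5)  # 0.2: holds the value at the knot to the left
lin_fn(1.5)   # 0.35: halfway between 0.2 and 0.5
step_fn(0)    # 0: yleft applies below the first knot
```

With method = "constant" the function jumps only at the knots, which is exactly the shape of an empirical CDF.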
To find out what the class on the result affects, you can use the methods() function:
methods(class = "stepfun")
#> [1] knots lines plot print summary
#> see '?methods' for accessing help and source code
Created on 2022-02-12 by the reprex package (v2.0.1.9000)
So potentially if you call any of the knots(), lines(), plot(), print() or summary() functions on the result you'll get behaviour tailored to step functions. However, the class of the result is computed as c("ecdf", "stepfun", class(rval)), so there might be "ecdf" methods that override the "stepfun" methods:
methods(class = "ecdf")
#> [1] plot print quantile summary
#> see '?methods' for accessing help and source code
Indeed, plot(), print() and summary() will dispatch to the "ecdf" methods in preference to the "stepfun" methods. That still leaves knots() and lines(), which fall through to the "stepfun" methods, and the "ecdf" methods could conceivably call NextMethod() to reach the "stepfun" ones as well.
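A small sketch of that dispatch, using a made-up sample rather than anything from the question: since the method listing above shows no "ecdf" method for knots(), the call should be handled by the "stepfun" method.

```r
# Hypothetical sample, for illustration only.
Fn <- ecdf(c(3, 1, 2, 2))

class(Fn)                # "ecdf" "stepfun" "function"
inherits(Fn, "stepfun")  # TRUE

# No knots() method is listed for "ecdf", so S3 dispatch moves on to the
# next class in the vector and uses the "stepfun" method.
knots(Fn)                # 1 2 3

# The object is still an ordinary function underneath.
Fn(2)                    # 0.75: three of the four values are <= 2
```

Without "stepfun" in the class vector, knots(Fn) would have no method to dispatch to for this object.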
One clarification: the method argument to approxfun() is just the name of an argument; in the discussion above, "methods" was used to refer to one of the ways R does object oriented programming, using methods and classes.
