This function applies a non-parametric method to estimate failure probabilities for complete data or for data that contain (multiple) right-censored observations.
# S3 method for default
estimate_cdf(
x,
status,
id = NULL,
method = c("mr", "johnson", "kaplan", "nelson"),
options = list(),
...
)
x
: A numeric vector of lifetime data. Lifetime data can be any characteristic that influences the reliability of a product, e.g. operating time (days/months in service), mileage (km, miles), load cycles.
status
: A vector of binary data (0 or 1) indicating whether unit i is a right-censored observation (= 0) or a failure (= 1).
id
: A vector for the identification of every unit. Default is NULL.
method
: Method used for the estimation of failure probabilities. See 'Details'.
options
: A list of named options. See 'Options'.
...
: Further arguments passed to or from other methods. Currently not used.
A tibble with class wt_cdf_estimation containing the following columns:
id
: Identification for every unit.
x
: Lifetime characteristic.
status
: Binary data (0 or 1) indicating whether a unit is a right
censored observation (= 0) or a failure (= 1).
rank
: The (computed) ranks. Determined for methods "mr" and "johnson", filled with NA for other methods or if status = 0.
prob
: Estimated failure probabilities, NA if status = 0.
cdf_estimation_method
: Specified method for the estimation of failure
probabilities.
The following techniques can be used for the method argument:
"mr"
: The Median Ranks method is used to estimate the failure probabilities
of failed units without considering censored items. Tied observations can be
handled in three ways (see 'Options'):
"max"
: Highest observed rank is assigned to tied observations.
"min"
: Lowest observed rank is assigned to tied observations.
"average"
: Mean rank is assigned to tied observations.
Two formulas can be used to determine the cumulative failure probabilities F(t) (see 'Options'):
"benard"
: Benard's approximation for Median Ranks.
"invbeta"
: Exact Median Ranks using the inverse beta distribution.
"johnson"
: The Johnson method is used to estimate the failure
probabilities of failed units, taking censored units into account. In contrast
to the complete-data case, the probabilities are corrected by computing
adjusted ranks. Two formulas can be used to determine the cumulative failure
probabilities F(t) (see 'Options'):
"benard"
: Benard's approximation for Median Ranks.
"invbeta"
: Exact Median Ranks using the inverse beta distribution.
"kaplan"
: The method of Kaplan and Meier is used to estimate the
survival function S(t) with respect to (multiple) right censored data.
The complement of S(t), i.e. F(t), is returned. In contrast to the
original Kaplan-Meier estimator, one modification is made (see 'References').
"nelson"
: The Nelson-Aalen estimator models the cumulative hazard rate
function in case of (multiple) right censored data. Equating the formal
definition of the hazard rate with that according to Nelson-Aalen results
in a formula for the calculation of failure probabilities.
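The three censoring-aware methods described above can be illustrated with a minimal base R sketch. It uses the textbook forms of the estimators on hypothetical toy data without ties and is not the package's implementation. In particular, the modification of the Kaplan-Meier estimator mentioned above is not reproduced, and the conversion F(t) = 1 - exp(-H(t)) for Nelson-Aalen is the standard relation between cumulative hazard and failure probability, assumed here for illustration. All variable names are made up for this example.

```r
# Hypothetical toy data: lifetimes and status (1 = failure, 0 = right-censored).
x      <- c(50, 80, 110, 140, 190)
status <- c(1, 0, 1, 1, 0)

# Sort by lifetime; at_risk[k] counts the units still under observation
# just before the k-th ordered lifetime (assumes no tied lifetimes).
ord     <- order(x)
x       <- x[ord]
status  <- status[ord]
n       <- length(x)
at_risk <- n - seq_len(n) + 1

# Johnson: adjusted ranks for failures, then Benard's approximation.
adj_rank <- rep(NA_real_, n)
prev <- 0
for (k in seq_len(n)) {
  if (status[k] == 1) {
    prev <- prev + (n + 1 - prev) / (1 + at_risk[k])
    adj_rank[k] <- prev
  }
}
johnson <- (adj_rank - 0.3) / (n + 0.4)

# Kaplan-Meier (unmodified textbook form): F(t) = 1 - S(t).
surv   <- cumprod(ifelse(status == 1, 1 - 1 / at_risk, 1))
kaplan <- ifelse(status == 1, 1 - surv, NA_real_)

# Nelson-Aalen: cumulative hazard H(t), then F(t) = 1 - exp(-H(t)).
cum_haz <- cumsum(ifelse(status == 1, 1 / at_risk, 0))
nelson  <- ifelse(status == 1, 1 - exp(-cum_haz), NA_real_)

result <- data.frame(x, status, rank = adj_rank, johnson, kaplan, nelson)
print(result)
```

As in the returned tibble described above, censored units (status = 0) carry NA in the rank and probability columns of this sketch.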
Argument options is a named list of options:
Method | Name | Value |
mr | mr_method | "benard" (default) or "invbeta" |
mr | mr_ties.method | "max" (default), "min" or "average" |
johnson | johnson_method | "benard" (default) or "invbeta" |
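To make the effect of these options concrete, the following base R sketch shows how the three tie-handling rules assign ranks and how Benard's approximation compares with exact median ranks from the inverse beta distribution. It is an illustration on hypothetical complete data, not the package's code. The use of base `rank()` for ties and the helper names `benard()` and `invbeta()` are assumptions made for this example.

```r
# Hypothetical complete sample (all units failed) with one tie at 150.
x <- c(100, 150, 150, 220, 300)
n <- length(x)

# Ranks under the three tie-handling rules.
ranks <- list(
  max     = rank(x, ties.method = "max"),
  min     = rank(x, ties.method = "min"),
  average = rank(x, ties.method = "average")
)

# Benard's approximation of the median rank for rank i out of n.
benard <- function(i, n) (i - 0.3) / (n + 0.4)

# Exact median rank: the median of a Beta(i, n - i + 1) distribution.
invbeta <- function(i, n) stats::qbeta(0.5, i, n - i + 1)

# Compare both formulas using the "max" tie rule.
probs <- data.frame(
  x       = x,
  rank    = ranks$max,
  benard  = benard(ranks$max, n),
  invbeta = invbeta(ranks$max, n)
)
print(probs)
```

The two formulas typically differ only in the third or fourth decimal place, so the choice matters mainly when exact values are required.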
NIST/SEMATECH e-Handbook of Statistical Methods, 8.2.1.5. Empirical model fitting - distribution free (Kaplan-Meier) approach, NIST SEMATECH, December 3, 2020
# Vectors:
cycles <- alloy$cycles
status <- alloy$status
# Example 1 - Johnson method:
prob_tbl <- estimate_cdf(
  x = cycles,
  status = status,
  method = "johnson"
)
# Example 2 - Method 'mr' with options:
prob_tbl_2 <- estimate_cdf(
  x = cycles,
  status = status,
  method = "mr",
  options = list(
    mr_method = "invbeta",
    mr_ties.method = "average"
  )
)
#> The 'mr' method only considers failed units (status == 1) and does not retain intact units (status == 0).