Mixture Model Identification using Segmented Regression — mixmod

This function uses piecewise linear regression to divide the data into subgroups. See 'Details'.

mixmod_regression(x, ...)

# S3 method for wt_cdf_estimation
mixmod_regression(
  x,
  distribution = c("weibull", "lognormal", "loglogistic"),
  conf_level = 0.95,
  k = 2,
  control = segmented::seg.control(),
  ...
)

Arguments

x: A tibble with class wt_cdf_estimation returned by estimate_cdf.
...: Further arguments passed to or from other methods. Currently not used.
distribution: Assumed distribution of the random variable. One of "weibull", "lognormal" or "loglogistic".
conf_level: Confidence level of the confidence intervals computed for the estimated parameters.
k: Number of mixture components. To split the data in an automated fashion, set k to NULL; the argument fix.npsi of control is then set to FALSE.
control: Output of the call to seg.control, which is passed to segmented.lm. See 'Examples' for usage.

Value

If no breakpoint was detected, a list with classes wt_model and wt_rank_regression. See rank_regression.

If at least one breakpoint was determined, a list with classes wt_model and wt_mixmod_regression. The length of this list depends on the number of identified subgroups, and each list element contains the information provided by rank_regression. In addition, the tibble data returned in each list element retains only the failed units and has two more columns:

q : Quantiles of the standard distribution calculated from column prob.
group : Membership to the respective segment.
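
For illustration, the column q can be reproduced by hand. The sketch below assumes that "standard distribution" means the standardized location-scale distribution behind each probability plot: smallest extreme value for "weibull", standard normal for "lognormal" and standard logistic for "loglogistic". The helper name standard_quantile is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): maps estimated failure
# probabilities to quantiles of the assumed standard distribution.
standard_quantile <- function(prob,
                              distribution = c("weibull", "lognormal", "loglogistic")) {
  distribution <- match.arg(distribution)
  switch(
    distribution,
    weibull = log(-log(1 - prob)),     # standard smallest extreme value
    lognormal = stats::qnorm(prob),    # standard normal
    loglogistic = stats::qlogis(prob)  # standard logistic
  )
}

prob <- c(0.1, 0.5, 0.9)
q_weibull <- standard_quantile(prob, "weibull")
q_lognormal <- standard_quantile(prob, "lognormal")
q_loglogistic <- standard_quantile(prob, "loglogistic")
```

Plotting the log lifetimes against these quantiles gives the probability-plot coordinates in which the segments are expected to appear as straight lines.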

If more than one method was specified in estimate_cdf, the resulting output is a list with classes wt_model and wt_mixmod_regression_list where each list element has classes wt_model and wt_mixmod_regression.

Details

The segmentation is based on the observed lifetimes of the failed units and their estimated failure probabilities. Intact (censored) units are taken into account when these probabilities are estimated. The segmentation itself is performed by segmented.lm.

Segmentation can be done with a specified number of subgroups or in an automated fashion (see argument k). The algorithm tends to overestimate the number of breakpoints when the separation is done automatically (see 'Warning' in segmented.lm).
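
To make the idea concrete, the following base R sketch locates a single breakpoint in Weibull probability-plot coordinates. It is not the package's implementation: a brute-force search over candidate breakpoints stands in for segmented.lm, the data are constructed rather than observed, the failure probabilities use Benard's median rank approximation for complete data, and regressing log lifetime on q is an assumption. The helper name find_breakpoint is hypothetical.

```r
# Median-rank failure probabilities for n complete (uncensored) observations,
# using Benard's approximation (i - 0.3) / (n + 0.4).
n <- 30
prob <- (seq_len(n) - 0.3) / (n + 0.4)

# Weibull probability-plot scale: standard smallest extreme value quantiles.
q <- log(-log(1 - prob))

# Constructed log-lifetimes with a kink at q = -1: slope 2 below the kink
# (shape 1 / 2 = 0.5), slope 0.5 above it (shape 1 / 0.5 = 2).
true_psi <- -1
log_x <- 5 + 2 * pmin(q - true_psi, 0) + 0.5 * pmax(q - true_psi, 0)

# Hypothetical helper: brute-force search for one breakpoint. For every
# candidate psi, fit log_x ~ q + pmax(q - psi, 0) and keep the candidate
# with the smallest residual sum of squares.
find_breakpoint <- function(q, log_x, candidates) {
  rss <- vapply(candidates, function(psi) {
    fit <- stats::lm(log_x ~ q + pmax(q - psi, 0))
    sum(stats::residuals(fit)^2)
  }, numeric(1))
  candidates[which.min(rss)]
}

# Candidate breakpoints on a grid strictly inside the range of q.
candidates <- seq(-3, 1, by = 0.05)
psi_hat <- find_breakpoint(q, log_x, candidates)

# Assign each unit to a segment and fit one straight line per segment.
group <- ifelse(q <= psi_hat, 1L, 2L)
fits <- lapply(
  split(data.frame(q = q, log_x = log_x), group),
  function(d) stats::coef(stats::lm(log_x ~ q, data = d))
)

# On the Weibull scale: shape = 1 / slope, scale = exp(intercept).
shapes <- vapply(fits, function(b) unname(1 / b[["q"]]), numeric(1))
```

A grid search like this only handles one breakpoint and ignores estimation uncertainty; it is meant to show what a breakpoint on the q scale represents, not to replace the iterative procedure of segmented.lm.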

In reliability analysis, it is important to identify and separately analyze the main types of failure. These are:

early failures,
random failures and
wear-out failures.

To reduce the risk of overestimating the number of breakpoints while still being able to capture the main types of failure, a maximum of three subgroups (k = 3) is recommended.
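
When the Weibull distribution is used, a common rule of thumb links these failure types to the shape parameter of a segment: a shape below 1 implies a decreasing failure rate (early failures), a shape close to 1 a roughly constant rate (random failures), and a shape above 1 an increasing rate (wear-out failures). The sketch below applies that rule. The helper classify_failure_type and its tolerance tol are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): label a segment by its
# estimated Weibull shape parameter. The tolerance around 1 is an arbitrary
# illustrative choice, not a recommendation.
classify_failure_type <- function(shape, tol = 0.1) {
  ifelse(
    shape < 1 - tol, "early failures",
    ifelse(shape > 1 + tol, "wear-out failures", "random failures")
  )
}

failure_types <- classify_failure_type(c(0.6, 1.02, 3.1))
```

In practice the confidence intervals of the shape estimates should be checked before a label is assigned, since an interval that covers 1 does not clearly separate the three cases.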

References

Doganaksoy, N.; Hahn, G.; Meeker, W. Q., Reliability Analysis by Failure Mode, Quality Progress, 35(6), 47-52, 2002

Examples

# Load the package that provides the functions and data sets used below:
library(weibulltools)

# Reliability data preparation:
## Data for mixture model:
data_mix <- reliability_data(
  voltage,
  x = hours,
  status = status
)

## Data for simple unimodal distribution:
data <- reliability_data(
  shock,
  x = distance,
  status = status
)

# Probability estimation with one method:
prob_mix <- estimate_cdf(
  data_mix,
  methods = "johnson"
)

prob <- estimate_cdf(
  data,
  methods = "johnson"
)

# Probability estimation for multiple methods:
prob_mix_mult <- estimate_cdf(
  data_mix,
  methods = c("johnson", "kaplan", "nelson")
)

# Example 1 - Mixture identification using k = 2 two-parametric Weibull models:
mix_mod_weibull <- mixmod_regression(
  x = prob_mix,
  distribution = "weibull",
  conf_level = 0.99,
  k = 2
)

# Example 2 - Mixture identification using k = 3 two-parametric lognormal models:
mix_mod_lognorm <- mixmod_regression(
  x = prob_mix,
  distribution = "lognormal",
  k = 3
)

# Example 3 - Mixture identification for multiple methods specified in estimate_cdf:
mix_mod_mult <- mixmod_regression(
  x = prob_mix_mult,
  distribution = "loglogistic"
)

# Example 4 - Mixture identification using control argument:
mix_mod_control <- mixmod_regression(
  x = prob_mix,
  distribution = "weibull",
  control = segmented::seg.control(display = TRUE)
)
#> boot sample =  1  opt.dev = 1.42543  n.psi = 1  est.psi = -0.948 
#> boot sample =  2  opt.dev = 1.42543  n.psi = 1  est.psi = -0.948 
#> boot sample =  3  opt.dev = 1.42543  n.psi = 1  est.psi = -0.948 
#> boot sample =  4  opt.dev = 1.42543  n.psi = 1  est.psi = -0.948 
#> boot sample =  5  opt.dev = 1.42543  n.psi = 1  est.psi = -0.948 
#> boot sample =  6  opt.dev = 1.42543  n.psi = 1  est.psi = -0.948 

# Example 5 - Mixture identification performs rank_regression for k = 1:
mod <- mixmod_regression(
  x = prob,
  distribution = "weibull",
  k = 1
)