An illustrative field dataset that contains a variety of variables commonly collected in the automotive sector.
The dataset has complete information about failed and incomplete information about intact vehicles. See 'Format' and 'Details' for further insights.
field_data
A tibble with 10,684 rows and 20 variables:
Vehicle identification number.
Days in service.
Distances covered, which are unknown for censored units.
1 for failed and 0 for censored units.
Date of production.
Date of registration. Known for all failed units and for a few intact units.
The date on which the failure was repaired. It is assumed that the repair date is equal to the date of failure occurrence.
The date on which lifetime information about the failure became available.
Delivering country.
The region within the country of delivery. Known for registered vehicles, NA for units with a missing registration_date.
Climatic zone based on the "Köppen-Geiger" climate classification. Known for registered vehicles, NA for units with a missing registration_date.
Climatic subzone based on the "Köppen-Geiger" climate classification. Known for registered vehicles, NA for units with a missing registration_date.
Brand of the vehicle.
Model of the vehicle.
Type of the engine.
Date on which the engine was installed.
Type of the gear.
Date on which the gear was installed.
Transmission of the vehicle.
Vehicle fuel.
All vehicles were produced in 2014 and an analysis of the field data was made at the end of 2015. At the date of analysis, there were 684 failed and 10,000 intact vehicles.
Censored vehicles:
For censored units the service time (dis) was computed as the difference
between the date of analysis "2015-12-31" and the registration_date.
For many units the latter date is unknown. For these, the difference between the
analysis date and production_date was used to get a rough estimate of
the real service time. This uncertainty has to be considered in the subsequent
analysis (see delay in registration in the section 'Details' of mcs_delay).
Furthermore, due to the delay in report, the computed service time could also
be inaccurate. This uncertainty should be considered as well (see
delay in report in the section 'Details' of mcs_delay).
The lifetime characteristic mileage is unknown for all censored units.
If an analysis is to be made for this lifetime characteristic, covered distances
for these units have to be estimated (see mcs_mileage).
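The service time rule for censored units described above can be illustrated with a minimal base R sketch. The two-row data frame `censored` and its dates are made up for illustration and are not taken from field_data; only the column names registration_date, production_date and dis follow this help page. It is not necessarily how the dataset was actually generated.

```r
# Hypothetical toy data, NOT rows of field_data.
analysis_date <- as.Date("2015-12-31")
censored <- data.frame(
  production_date   = as.Date(c("2014-03-10", "2014-07-01")),
  registration_date = as.Date(c("2014-05-02", NA))
)

# Use registration_date where it is known, otherwise fall back to
# production_date (rough estimate of the start of service).
start_date <- censored$registration_date
missing_reg <- is.na(start_date)
start_date[missing_reg] <- censored$production_date[missing_reg]

# Days in service up to the date of analysis.
censored$dis <- as.numeric(
  difftime(analysis_date, start_date, units = "days")
)
```

For units where the fallback was used, the resulting dis overstates the real service time by the (unknown) delay between production and registration, which is the uncertainty mentioned above.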
Failed vehicles:
For failed units the service time (dis) is computed as the difference
between repair_date and registration_date, which are known for all of them.
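As a minimal sketch of this computation for failed units, again in base R: the data frame `failed` and its dates are hypothetical and only the column names repair_date, registration_date and dis follow this help page.

```r
# Hypothetical toy data, NOT rows of field_data.
failed <- data.frame(
  registration_date = as.Date(c("2014-04-01", "2014-09-15")),
  repair_date       = as.Date(c("2014-10-01", "2015-06-20"))
)

# Days in service: repair date (assumed equal to the failure date)
# minus registration date.
failed$dis <- as.numeric(
  difftime(failed$repair_date, failed$registration_date, units = "days")
)
```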