Published in

2014 Tenth European Dependable Computing Conference

DOI: 10.1109/edcc.2014.15

Do I Need to Fix a Failed Component Now, or Can I Wait Until Tomorrow?

Proceedings article published in 2014 by Muffy Calder and Michele Sevegnani
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

We investigate how predictive event-based modelling can inform operational decision making in complex systems with component failures. By relating the status of components to service availability, and by using stochastic temporal logic reasoning, we quantify the risk of service failure both now and in the future, after a given elapsed time. Decisions can then be taken according to those risks. We demonstrate the approach by applying it to an industrial case study: a system, deployed for some time, in which component failures are sensed and monitored. A novel aspect is that we calibrate the model(s) according to inferences over historical field data, so the results of our reasoning can inform decision making in the actual deployed system.