Software failures are still a major concern in mission- and enterprise-critical contexts, despite significant efforts spent in software testing. In fact, while software testing is effective against easily reproducible bugs (Bohrbugs), it is considerably less suitable for dealing with bugs that lead to hard-to-reproduce failures (Mandelbugs). On the positive side, the elusive nature of Mandelbugs provides opportunities for failure recovery, which are investigated in this paper. Based on real cases of Mandelbugs in eleven Information Technology (IT) systems running in production, the paper proposes a model that describes the recovery processes in IT systems. It then presents closed-form expressions for, and a numerical analysis of, the mean time to recovery and the software (un)availability. This analysis allows the designer to compare recovery strategies and to determine which parameters most strongly influence the efficacy of recovery from failures caused by Mandelbugs.

Recovery from Software Failures Caused by Mandelbugs / Grottke, Michael; Kim, Dong Seong; Mansharamani, Rajesh; Nambiar, Manoj; Natella, Roberto; Trivedi, Kishor S. - In: IEEE TRANSACTIONS ON RELIABILITY. - ISSN 0018-9529. - 65:1(2016), pp. 70-87. [10.1109/TR.2015.2452933]

Recovery from Software Failures Caused by Mandelbugs

Natella, Roberto; Trivedi, Kishor S.
2016

Files in this item:
07173065.pdf: Post-print document, Adobe PDF, 1.87 MB. Access: private/restricted (authorized users only).

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/697438