The analysis of monitoring data is extremely valuable for critical computer systems. It provides insight into the failure behavior of a system under real workload conditions, which is crucial for ensuring service continuity and reducing downtime. This paper presents an experimental evaluation of three direct monitoring techniques that are widely used in critical industrial systems: event logs, assertions, and source code instrumentation. We inject 12,733 software faults into a real-world air traffic control (ATC) middleware system to analyze the ability of these techniques to produce information when failures occur. Experimental results indicate that each technique covers only a limited number of failure manifestations. Moreover, we observe that the quality of the collected data for supporting failure diagnosis varies strongly across the techniques considered in this study.

Assessing Direct Monitoring Techniques to Analyze Failures of Critical Industrial Systems / Cinque, Marcello; Cotroneo, Domenico; DELLA CORTE, Raffaele; Pecchia, Antonio. - (2014), pp. 212-222. (Paper presented at the 25th IEEE International Symposium on Software Reliability Engineering, ISSRE 2014, held in Naples, Italy, November 3-6, 2014) [10.1109/ISSRE.2014.30].

Assessing Direct Monitoring Techniques to Analyze Failures of Critical Industrial Systems

CINQUE, MARCELLO;COTRONEO, DOMENICO;DELLA CORTE, RAFFAELE;PECCHIA, ANTONIO
2014

ISBN: 9781479960323
Use this identifier to cite or link to this document: https://hdl.handle.net/11588/599475