The simple pattern minimality problem (SPMP) represents a central problem in the logical analysis of data and association rules mining, and it finds applications in several fields as logic synthesis, reliability analysis, and automated reasoning. It consists of determining the minimum number of patterns explaining all the observations of a data set, that is, a Boolean logic formula that is true for all the elements of the data set and false for all the unseen observations. We refer to this problem as covering SPMP (C-SPMP), because each observation can be explained (covered) by more than one pattern. Starting from a real industrial application, we also define a new version of the problem, and we refer to it as partitioning SPMP (P-SPMP), because each observation has to be covered just once. Given a propositional formula or a truth table, C-SPMP and P-SPMP coincide exactly with the problem of determining the minimum disjunctive and minimum exclusive disjunctive normal form, respectively. Both problems are known to be NP-hard and have been generally tackled by heuristic methods. In this context, the contribution of this work is twofold. On one side, it provides two original integer linear programming formulations for the two variants of the SPMP. These formulations exploit the concept of Boolean hypercube to build a graph representation of the problems and allow to exactly solve instances with more than 1,000 observations by using an MIP solver. On the other side, two effective and fast heuristics are proposed to solve relevant size instances taken from literature (SeattleSNPs) and from the industrial database. The proposed methods do not suffer from the same dimensional drawbacks of the methods present in the literature and outperform either existing commercial and freeware logic tools or the available industrial solutions in the number of generated patterns and/or in the computational burden.

Simple pattern minimality problems: Integer linear programming formulations and covering-based heuristic solving approaches / Boccia, M.; Sforza, A.; Sterle, C.. - In: INFORMS JOURNAL ON COMPUTING. - ISSN 1091-9856. - 32:4(2020), pp. 1049-1060. [10.1287/ijoc.2019.0940]

Simple pattern minimality problems: Integer linear programming formulations and covering-based heuristic solving approaches

Boccia M.;Sforza A.;Sterle C.
2020

Abstract

The simple pattern minimality problem (SPMP) represents a central problem in the logical analysis of data and association rules mining, and it finds applications in several fields as logic synthesis, reliability analysis, and automated reasoning. It consists of determining the minimum number of patterns explaining all the observations of a data set, that is, a Boolean logic formula that is true for all the elements of the data set and false for all the unseen observations. We refer to this problem as covering SPMP (C-SPMP), because each observation can be explained (covered) by more than one pattern. Starting from a real industrial application, we also define a new version of the problem, and we refer to it as partitioning SPMP (P-SPMP), because each observation has to be covered just once. Given a propositional formula or a truth table, C-SPMP and P-SPMP coincide exactly with the problem of determining the minimum disjunctive and minimum exclusive disjunctive normal form, respectively. Both problems are known to be NP-hard and have been generally tackled by heuristic methods. In this context, the contribution of this work is twofold. On one side, it provides two original integer linear programming formulations for the two variants of the SPMP. These formulations exploit the concept of Boolean hypercube to build a graph representation of the problems and allow to exactly solve instances with more than 1,000 observations by using an MIP solver. On the other side, two effective and fast heuristics are proposed to solve relevant size instances taken from literature (SeattleSNPs) and from the industrial database. The proposed methods do not suffer from the same dimensional drawbacks of the methods present in the literature and outperform either existing commercial and freeware logic tools or the available industrial solutions in the number of generated patterns and/or in the computational burden.
2020
Simple pattern minimality problems: Integer linear programming formulations and covering-based heuristic solving approaches / Boccia, M.; Sforza, A.; Sterle, C.. - In: INFORMS JOURNAL ON COMPUTING. - ISSN 1091-9856. - 32:4(2020), pp. 1049-1060. [10.1287/ijoc.2019.0940]
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11588/828654
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 3
  • ???jsp.display-item.citation.isi??? 1
social impact