A modification to the mixed-integer nonlinear programming (MINLP) formulation for symbolic regression was proposed with the aim of identification of physical models from noisy experimental data. In the proposed formulation, a binary tree in which equations are represented as directed, acyclic graphs, is fully constructed for a pre-defined number of layers. The introduced modification results in the reduction in the number of required binary variables and removal of redundancy due to possible symmetry of the tree formulation. The formulation was tested using numerical models and was found to be more efficient than the previous literature example with respect to the numbers of predictor variables and training data points. The globally optimal search was extended to identify physical models and to cope with noise in the experimental data predictor variable. The methodology was proven to be successful in identifying the correct physical models describing the relationship between shear stress and shear rate for both Newtonian and non-Newtonian fluids, and simple kinetic laws of chemical reactions. Future work will focus on addressing the limitations of the present formulation and solver to enable extension of target problems to larger, more complex physical models.

A new formulation for symbolic regression to identify physico-chemical laws from experimental data / Neumann, P.; Cao, L.; Russo, D.; Vassiliadis, V. S.; Lapkin, A. A.. - In: CHEMICAL ENGINEERING JOURNAL. - ISSN 1385-8947. - 387:(2020), p. 123412. [10.1016/j.cej.2019.123412]

A new formulation for symbolic regression to identify physico-chemical laws from experimental data

Russo D.;
2020

Abstract

A modification to the mixed-integer nonlinear programming (MINLP) formulation for symbolic regression was proposed with the aim of identification of physical models from noisy experimental data. In the proposed formulation, a binary tree in which equations are represented as directed, acyclic graphs, is fully constructed for a pre-defined number of layers. The introduced modification results in the reduction in the number of required binary variables and removal of redundancy due to possible symmetry of the tree formulation. The formulation was tested using numerical models and was found to be more efficient than the previous literature example with respect to the numbers of predictor variables and training data points. The globally optimal search was extended to identify physical models and to cope with noise in the experimental data predictor variable. The methodology was proven to be successful in identifying the correct physical models describing the relationship between shear stress and shear rate for both Newtonian and non-Newtonian fluids, and simple kinetic laws of chemical reactions. Future work will focus on addressing the limitations of the present formulation and solver to enable extension of target problems to larger, more complex physical models.
2020
A new formulation for symbolic regression to identify physico-chemical laws from experimental data / Neumann, P.; Cao, L.; Russo, D.; Vassiliadis, V. S.; Lapkin, A. A.. - In: CHEMICAL ENGINEERING JOURNAL. - ISSN 1385-8947. - 387:(2020), p. 123412. [10.1016/j.cej.2019.123412]
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11588/895647
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 28
  • ???jsp.display-item.citation.isi??? 24
social impact