MetalHawk: Enhanced Classification of Metal Coordination Geometries by Artificial Neural Networks / Sgueglia, Gianmattia; Vrettas, Michail D.; Chino, Marco; De Simone, Alfonso; Lombardi, Angela. - In: Journal of Chemical Information and Modeling. - ISSN 1549-9596. - (2023). [10.1021/acs.jcim.3c00873]

MetalHawk: Enhanced Classification of Metal Coordination Geometries by Artificial Neural Networks

Gianmattia Sgueglia (co-first author); Marco Chino; Alfonso De Simone; Angela Lombardi
2023

Abstract

The chemical properties of metal complexes depend strongly on the number and geometrical arrangement of ligands coordinated to the metal center. Existing methods for determining either coordination number or geometry involve a trade-off between accuracy and computational cost, which hinders their application to large structure data sets. Here, we propose MetalHawk (https://github.com/vrettasm/MetalHawk), a machine learning-based approach that simultaneously classifies the coordination number and geometry of metal sites through artificial neural networks (ANNs), which were trained using the Cambridge Structural Database (CSD) and Metal Protein Data Bank (MetalPDB). We demonstrate that the CSD-trained model classifies sites belonging to the most common coordination numbers and geometry classes with a balanced accuracy of 96.51% for CSD-deposited metal sites. The CSD-trained model was also found to be capable of classifying bioinorganic metal sites from the MetalPDB database, with a balanced accuracy of 84.29% on the whole PDB data set and 91.66% on manually reviewed sites in the PDB validation set. Moreover, we report evidence that the output vectors of the CSD-trained model can serve as a proxy indicator of metal-site distortions, showing that they can be interpreted as a low-dimensional representation of subtle geometrical features present in metal site structures.
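The figures above are balanced accuracies. In its standard definition, balanced accuracy is the mean of the per-class recall, so rare geometry classes weigh as much as common ones. The sketch below is a minimal illustration of that definition and is not taken from the MetalHawk code base; the function name, the class labels, and the toy predictions are all hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data: 8 octahedral sites and 2 tetrahedral sites.
y_true = ["octahedral"] * 8 + ["tetrahedral"] * 2
# One of the two tetrahedral sites is misclassified as octahedral.
y_pred = ["octahedral"] * 8 + ["octahedral", "tetrahedral"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

In this toy case plain accuracy is 0.9, while balanced accuracy drops to 0.75 because half of the minority class is misclassified. This sensitivity to rare classes is why the metric suits data sets in which some coordination geometries are much rarer than others.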
Use this identifier to cite or link to this document: https://hdl.handle.net/11588/947017