The pervasive use of IoT devices in Smart Cities has increased the volume of data produced in many fields. Useful applications are growing in number in the e-health domain, where smart devices manage huge amounts of data in highly distributed environments and support smart services that collect data to fill patients' medical records. The problem is to gather data, produce records, and analyze medical records according to their contents. Because data gathering involves very different devices (not only wearable medical sensors but also environmental smart devices such as weather and pollution sensors), it is very difficult to classify data according to their contents in a way that enables better patient management. Data from smart devices are coupled with medical records written in natural language. We describe an architecture that determines the best features for classification based on existing medical records. The architecture relies on a pre-filtering phase based on Natural Language Processing, which enhances machine learning classification based on Random Forests. We carried out experiments on about 5000 medical records from real (anonymized) case studies from various health-care organizations in Italy. We report the accuracy of the presented approach in terms of Accuracy-Rejection curves.
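For readers unfamiliar with the evaluation mentioned at the end of the abstract, the following is a minimal sketch of how an accuracy-rejection curve can be computed in general. It is not taken from the paper: the function name `accuracy_rejection_curve`, the toy confidence values, and the idea of using a classifier's confidence score (for a random forest, for example, the share of trees voting for the predicted class) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def accuracy_rejection_curve(
    confidences: Sequence[float],
    correct: Sequence[bool],
) -> List[Tuple[float, float]]:
    """Return (rejection_rate, accuracy_on_accepted) points.

    The least confident predictions are rejected first. Each point gives
    the fraction of rejected samples and the accuracy measured on the
    samples that are still accepted.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return []

    # Sort indices from most to least confident, so that the first
    # `kept` entries form the accepted set for every value of `kept`.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)

    # prefix[k - 1] = number of correct predictions among the k most confident.
    prefix: List[int] = []
    running = 0
    for i in order:
        running += 1 if correct[i] else 0
        prefix.append(running)

    points: List[Tuple[float, float]] = []
    for kept in range(n, 0, -1):
        rejection_rate = (n - kept) / n
        accuracy = prefix[kept - 1] / kept
        points.append((rejection_rate, accuracy))
    return points


# Hypothetical toy data (not from the paper): five predictions with their
# confidence scores and whether each prediction was correct.
toy_confidences = [0.95, 0.90, 0.80, 0.60, 0.55]
toy_correct = [True, True, False, True, False]
points = accuracy_rejection_curve(toy_confidences, toy_correct)
```

A curve that rises as the rejection rate grows indicates that the confidence score is informative: the predictions the classifier is least sure about are also the ones most likely to be wrong.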

Enhancing random forest classification with NLP in DAMEH: A system for DAta Management in eHealth Domain / Amato, F.; Coppolino, L.; Cozzolino, G.; Mazzeo, G.; Moscato, F.; Nardone, R. - In: NEUROCOMPUTING. - ISSN 0925-2312. - 444:(2021), pp. 79-91. [10.1016/j.neucom.2020.08.091]

Enhancing random forest classification with NLP in DAMEH: A system for DAta Management in eHealth Domain

Amato F.; Coppolino L.; Cozzolino G.; Moscato F.; Nardone R.
2021

Files in this item:
File: 2021-07 Neurocomputing.pdf (authorized users only)
Type: Publisher's version (PDF)
License: Private/restricted access
Size: 1.71 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/858440
Citations
  • PubMed Central: n/a
  • Scopus: 15
  • Web of Science: 8