The healthcare domain is characterized by a huge amount of data contained in medical records, reports, test results and so on. To support healthcare workers and manage relevant data effectively and efficiently, it is important to correctly classify the unstructured text embedded in medical documents. In this paper, we propose a classification system for medical record categorization that combines different methodologies based on lexical, syntactic and semantic analysis of the documents. We show that a classification system based on a combination of different text analysis methodologies outperforms each methodology taken alone. The results are presented in terms of Accuracy-Rejection Curves. Finally, the pros and cons of the proposed architecture and some future work are pointed out. © 2014 The authors and IOS Press. All rights reserved.
A study on textual features for medical records classification / Alicante, A.; Amato, Flora; Cozzolino, Giovanni; Gargiulo, F.; Improda, N.; Mazzeo, Antonino. - Studies in Health Technology and Informatics 207:(2014), pp. 370-379. (Paper presented at the 2nd KES International Conference on Innovation in Medicine and Healthcare, InMed 2014, held in San Sebastian, Spain, 9 July 2014 through 11 July 2014; Code 109597) [10.3233/978-1-61499-474-9-370].
A study on textual features for medical records classification
AMATO, FLORA; COZZOLINO, GIOVANNI; MAZZEO, ANTONINO
2014