A method for the evaluation of the peculiar lexicon significance / Amato, Flora; Mazzeo, Antonino; Scippacercola, Sergio. - In: ELECTRONIC JOURNAL OF APPLIED STATISTICAL ANALYSIS: DECISION SUPPORT SYSTEMS AND SERVICES EVALUATION. - ISSN 2037-3627. - 2:1(2011), pp. 54-64. [10.1285/i2037-3627v2n1p54]
A method for the evaluation of the peculiar lexicon significance
Amato, Flora; Mazzeo, Antonino; Scippacercola, Sergio
2011
Abstract
This work proposes a methodology for the semi-automatic derivation of knowledge from document collections. To extract relevant information from documents, it applies a process that integrates statistical and lexical approaches. We propose a strategy for the semantic evaluation of the extracted index terms, in order to ensure a good correspondence between the information searched for and the information retrieved. To this end, we propose a system for extracting and assessing the peculiar lexicon. The system can be used to define an ontological model for the semantic processing of a corpus of documents belonging to a specialist domain.