SemTree: An index for supporting semantic retrieval of documents / Amato, Flora; De Santo, Aniello; Gargiulo, Francesco; Moscato, Vincenzo; Persia, Fabio; Picariello, Antonio; Poccia, Silvestro Roberto. - (2015), pp. 62-67. (Paper presented at the 31st IEEE International Conference on Data Engineering Workshops, ICDE 2015, held in Seoul (Korea), 13-17 April 2015) [10.1109/ICDEW.2015.7129546].
SemTree: An index for supporting semantic retrieval of documents
AMATO, FLORA; MOSCATO, VINCENZO; PICARIELLO, ANTONIO
2015
Abstract
In this paper, we propose SemTree, a novel semantic index for supporting information retrieval from very large document collections, assuming that the semantics of a document can be effectively expressed by a set of (subject, predicate, object) statements, as in the RDF model. A distributed version of the KD-Tree has then been adopted to provide a scalable solution to document indexing, leveraging the mapping of triples into a vector space. We investigate the feasibility of our approach in a real case study, considering the problem of finding inconsistencies in documents related to software requirements, and report some preliminary experimental results.
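The general idea of indexing triples as points in a KD-Tree can be illustrated with a minimal single-machine sketch. It is not the SemTree implementation: the abstract does not specify how triples are mapped to vectors or how the tree is distributed, so the `TERM_CODE` table, the `embed` function and the sample requirement triples below are all hypothetical stand-ins chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
Point = Tuple[float, ...]

# Hypothetical term codes (an assumption, NOT the paper's mapping): each term
# gets a number so that a (subject, predicate, object) triple becomes a 3-D point.
TERM_CODE = {
    "system": 1.0, "user": 2.0, "admin": 2.5,
    "shall_encrypt": 10.0, "shall_log": 11.0, "shall_not_log": 11.5,
    "password": 20.0, "access": 21.0,
}


def embed(triple: Triple) -> Point:
    """Map a triple to a point using the hypothetical term codes."""
    return tuple(TERM_CODE[term] for term in triple)


@dataclass
class Node:
    point: Point
    triple: Triple
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(items: List[Tuple[Point, Triple]], depth: int = 0) -> Optional[Node]:
    """Build a KD-Tree by splitting on the median, cycling through the axes."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda item: item[0][axis])
    mid = len(items) // 2
    point, triple = items[mid]
    return Node(point, triple, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], target: Point,
            best: Optional[Node] = None) -> Optional[Node]:
    """Return the node closest to target, pruning subtrees that cannot win."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best.point, target):
        best = node
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < sq_dist(best.point, target):
        best = nearest(far, target, best)
    return best


# Illustrative requirement statements (invented for this sketch).
triples: List[Triple] = [
    ("system", "shall_encrypt", "password"),
    ("system", "shall_log", "access"),
    ("admin", "shall_not_log", "access"),
    ("user", "shall_log", "password"),
]
tree = build([(embed(t), t) for t in triples])
query: Triple = ("user", "shall_log", "access")
hit = nearest(tree, embed(query))
```

Here `hit.triple` is the indexed statement closest to the query under the toy embedding. In a requirements-checking setting, such a neighbour would only be a candidate for further inspection; whether it is an actual inconsistency depends on a meaningful semantic mapping, which this sketch does not attempt to reproduce.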