Extracting and Classifying Keywords in Textual Data Analysis

Misuraca, M.; Scepi, Germana; Grassia, Maria Gabriella

In this paper, we consider a particular lexical table whose general term is the number of times that forms from two different vocabularies, collected on the same units, occur together. On this matrix, we first apply a factorial data analysis method to visualize and extract keywords; then, by means of a co-clustering technique, we identify classes of keywords for the two corpora. The main results of this strategy are illustrated by an application to two corpora: the language a set of firms use on their official websites to describe their core mission, and the language they use when recruiting new employees.
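As a minimal sketch of the kind of table the abstract describes, the Python snippet below counts, for every pair of forms drawn from two vocabularies, the number of units in which both forms appear. This is only one reading of "simultaneously present"; the function name `cross_lexical_table`, the whitespace tokenization, and the toy texts are assumptions made for illustration and are not taken from the paper. The factorial analysis and co-clustering steps are not shown.

```python
from collections import Counter


def cross_lexical_table(corpus_a, corpus_b):
    """Count, for each pair (form_a, form_b), the units where both occur.

    corpus_a[i] and corpus_b[i] are the two texts attached to unit i.
    Hypothetical helper: tokenization is a plain lowercase split.
    """
    if len(corpus_a) != len(corpus_b):
        raise ValueError("both corpora must be collected on the same units")
    table = Counter()
    for text_a, text_b in zip(corpus_a, corpus_b):
        # Presence/absence per unit: each form counts at most once per unit.
        forms_a = set(text_a.lower().split())
        forms_b = set(text_b.lower().split())
        for form_a in forms_a:
            for form_b in forms_b:
                table[(form_a, form_b)] += 1
    return table


# Invented toy data: one mission text and one job advertisement per firm.
missions = [
    "software consulting services",
    "software products",
    "financial consulting",
]
job_ads = [
    "developer wanted",
    "developer engineer",
    "analyst wanted",
]

table = cross_lexical_table(missions, job_ads)
print(table[("software", "developer")])
print(table[("financial", "analyst")])
```

Rows of such a table correspond to the forms of one vocabulary and columns to the forms of the other, so it can be passed to a factorial method or a co-clustering routine that operates on two-way contingency tables.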

Extracting and Classifying Keywords in Textual Data Analysis / Misuraca, M.; Scepi, Germana; Grassia, Maria Gabriella. - In: STATISTICA APPLICATA. - ISSN 1125-1964. - Print. - 17:4(2005), pp. 517-528.