
A Dive into the Dark Web: Hierarchical Traffic Classification of Anonymity Tools

Pescape, Antonio; Bovenzi, Giampaolo; Montieri, Antonio; Ciuonzo, Domenico; Persico, Valerio
2020

Abstract

Privacy-preserving protocols and tools are increasingly adopted by Internet users. These mechanisms are challenged by traffic classification (TC) which, besides being an important workhorse for several network management tasks, is a key factor in assessing their privacy level, from both offensive (malicious) and defensive (benign) standpoints. In this paper, we propose TC of anonymity tools (and, at a finer level, of the services and applications running within them) via a truly hierarchical approach. Capitalizing on a public dataset of anonymity traffic released in 2017, we provide an in-depth analysis of TC and compare the proposed hierarchical approach with a flat counterpart. The proposed framework is investigated in both the usual TC setup and its “early” variant (i.e., only the first segments of a traffic aggregate are used to take a decision). Results highlight a general improvement over the flat approach in terms of all the classification metrics. Further performance gains are obtained by tuning the thresholds that ensure progressive censoring. Finally, a fine-grained performance investigation demonstrates that the errors incurred by the hierarchical approach are less severe than those of the flat case, and highlights the poorly classifiable services/applications of each anonymity tool, providing useful feedback on their privacy level.
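
To make the idea of hierarchical classification with threshold-based censoring more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-level hierarchy (anonymity tool, then application), one classifier per internal node, and one confidence threshold per level; when the top confidence at a level falls below its threshold, that level and all finer ones are left undecided. The function name `classify_hierarchically`, the class labels, and the toy probability values are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# A node classifier maps a feature vector to a probability for each child label.
NodeClassifier = Callable[[Sequence[float]], Dict[str, float]]


def classify_hierarchically(
    features: Sequence[float],
    classifiers: Dict[str, NodeClassifier],
    thresholds: Sequence[float],
    root: str = "root",
) -> List[str]:
    """Descend the class hierarchy, stopping when confidence is too low.

    Returns the labels accepted so far, from coarsest to finest. A path
    shorter than the hierarchy depth means the finer levels were censored.
    """
    path: List[str] = []
    node = root
    for threshold in thresholds:
        if node not in classifiers:  # no finer level below this node
            break
        probabilities = classifiers[node](features)
        label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
        if confidence < threshold:  # censor this level and all finer ones
            break
        path.append(label)
        node = label
    return path


# Toy stand-ins for trained models (hypothetical labels, made-up scores).
def root_model(x: Sequence[float]) -> Dict[str, float]:
    return {"Tor": 0.9, "I2P": 0.1} if x[0] > 0.5 else {"Tor": 0.2, "I2P": 0.8}


def tor_model(x: Sequence[float]) -> Dict[str, float]:
    return {"Browsing": 0.55, "Streaming": 0.45}


def i2p_model(x: Sequence[float]) -> Dict[str, float]:
    return {"Browsing": 0.95, "Streaming": 0.05}


toy_classifiers = {"root": root_model, "Tor": tor_model, "I2P": i2p_model}

# The tool is accepted, but the application level is censored (0.55 < 0.6).
print(classify_hierarchically([0.9], toy_classifiers, thresholds=[0.6, 0.6]))
# Both levels are accepted.
print(classify_hierarchically([0.1], toy_classifiers, thresholds=[0.6, 0.6]))
```

In this sketch, raising a level's threshold trades coverage for reliability at that level: more samples are left with only a coarse label, but the fine-grained labels that are emitted rest on higher-confidence decisions. A flat counterpart would instead pick a single label among all leaf classes in one step.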

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11588/747326
Citations
  • PMC N/A
  • Scopus 39
  • ISI 28