Anonymizing networks such as Tor are attracting growing interest from users concerned about their anonymity and privacy. Despite a common belief in the research community that anonymizing networks are used to avoid political censorship and allow freedom of speech, few studies have explored how Tor is actually used in the wild.
Chaabane et al. analyzed Tor traffic to characterize its actual use and identify potentially undesirable behaviors that could affect the network’s operations. Tor clients typically build a communication circuit through three computers in the network. Messages are wrapped in successive layers of encryption (like a locked box inside a locked box, and so on). Each computer holds the decryption key for one layer, so the message can be unwrapped only as it progresses through the circuit in the correct order. The last computer (the exit node) removes the final layer and sends the message on to its destination. Each computer knows only the computer it received the message from and the one it passes the message to. Only the exit node can read the original message, and it has no information about where that message originated.
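The layered wrapping described above can be sketched in a few lines of Python. This is a toy illustration only: the XOR keystream stands in for real encryption and is not secure, and the relay names, keys, and the helper functions wrap, peel, and route are all hypothetical. Tor's actual cell format and key negotiation are considerably more involved.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Apply or remove one encryption layer (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, destination: str, circuit: list[tuple[str, bytes]]) -> bytes:
    """Client side: build the onion from the inside out.

    Each layer hides the name of the next hop plus the still-encrypted rest.
    """
    next_hops = [name for name, _ in circuit[1:]] + [destination]
    cell = message
    for (_, key), next_hop in zip(reversed(circuit), reversed(next_hops)):
        cell = xor_layer(key, next_hop.encode() + b"\n" + cell)
    return cell


def peel(key: bytes, cell: bytes) -> tuple[str, bytes]:
    """Relay side: remove one layer, learning only where to forward the rest."""
    next_hop, inner = xor_layer(key, cell).split(b"\n", 1)
    return next_hop.decode(), inner


# Hypothetical three-relay circuit with made-up keys (entry, middle, exit).
CIRCUIT = [("entry", b"key-entry"), ("middle", b"key-middle"), ("exit", b"key-exit")]


def route(message: bytes, destination: str, circuit=CIRCUIT):
    """Pass a wrapped message through every relay in order.

    Returns the next hop each relay learned and what the exit node delivers.
    """
    cell = wrap(message, destination, circuit)
    learned = []
    for _, key in circuit:
        next_hop, cell = peel(key, cell)
        learned.append(next_hop)
    return learned, cell
```

Calling route(b"GET /index.html", "example.org") walks the message through the entry, middle, and exit relays in turn. Each relay recovers only the name of the next hop, and the plaintext request appears only after the exit relay removes the last layer.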
The researchers set up six Tor exit nodes distributed around the world and monitored their use on two separate occasions between December 2009 and January 2010, for a total of 23 days. They used Deep Packet Inspection to analyze both the metadata and the content of messages, in order to identify the services being used on the network. The researchers took care to restrict their use of this technique to the minimum required, so as to preserve user privacy as much as possible.
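To give a rough idea of how content inspection can identify a service, the sketch below classifies a connection by the first bytes of its payload. It is a minimal illustration and not the researchers' tooling: the function names classify_payload and tally are hypothetical, and only three well-known signatures are checked (the BitTorrent peer handshake prefix, common HTTP request methods, and the TLS handshake record header). Anything else, including deliberately obfuscated traffic, falls into "unknown".

```python
from collections import Counter

# The BitTorrent peer handshake starts with the byte 19 and this fixed string.
BITTORRENT_HANDSHAKE = b"\x13BitTorrent protocol"
# A few common HTTP request methods (not an exhaustive list).
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")


def classify_payload(payload: bytes) -> str:
    """Guess the application protocol from the first bytes of a connection."""
    if payload.startswith(BITTORRENT_HANDSHAKE):
        return "bittorrent"
    if payload.startswith(HTTP_METHODS):
        return "http"
    # TLS handshake record: content type 0x16, protocol major version 0x03.
    if len(payload) >= 2 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    return "unknown"


def tally(payloads: list[bytes]) -> Counter:
    """Count how many payloads fall into each protocol class."""
    return Counter(classify_payload(p) for p in payloads)
```

A signature check like this only recognizes traffic that announces itself in the clear, which is why a large "unknown" category needs further analysis before it can be attributed to any particular service.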
The results showed a significant amount of unidentified traffic, which further analysis determined was likely concealed BitTorrent data. Combined with the identifiable BitTorrent traffic already detected on the network, this suggests that more than half of the traffic on Tor comes from BitTorrent activity. Analysis of what is downloaded through Tor indicates that most of that content is restricted under copyright law.
Other uses of Tor include web browsing, which appeared to be used much as on the open internet, despite the slower performance caused by the anonymization process. As expected, search engines were the most visited sites, followed by pornography in second place. Social networking websites were the fourth most visited. The small number of Tor users in politically sensitive countries suggests that this social networking traffic represents American and European employees circumventing restrictions on their work computers rather than political dissidents organizing protest actions. Germany and the United States accounted for a quarter of all Tor clients, with Russia and China in fifth and sixth place. Although 100 countries appeared to use the Tor network in the one-day period for which data was collected, 70% of that traffic came from just ten countries.
The researchers also detected several systems using the Tor network in a manner it was not designed for, passing messages through a single computer rather than a circuit of three or more. While this reduces the anonymizing power of Tor, it also increases the speed of connections while still providing an encrypted tunnel to an exit node. This could be a way for users to get around connection filtering, such as that employed within companies. It thus appears that a significant portion of Tor usage serves not to overcome restrictive internet and government regulation but rather to avoid detection of content piracy and to bypass local network filtering.
More than half of the traffic on Tor is likely BitTorrent or illegal content downloads.