Please note that the source article was published in 2008. Usage statistics and Tor network technology have evolved since then. We provide this summary because of the landmark status of the original publication and the historical significance of its information.
Tor is a popular privacy-enhancing system which conceals and protects its users’ internet usage from network surveillance. It directs Internet traffic through a carefully constructed three-hop path among a worldwide network of volunteer relays, using a layered encryption strategy.
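To make the layered strategy concrete, the toy sketch below wraps a message in three layers, one per hop, so that each relay can remove only its own layer and learns only where to forward the result. It is an illustration under loud assumptions, not Tor's actual cryptography: the hash-based XOR cipher, the cell layout, the relay names, and the functions `wrap` and `peel` are all hypothetical and offer no real security.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream from SHA-256 in counter mode (NOT secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Apply or remove one layer; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, hops: list[tuple[str, bytes]], destination: str) -> bytes:
    """Client side: build the layered cell for hops ordered entry -> exit.

    Each layer hides a one-byte length, the name of the next hop, and the
    still-wrapped inner cell. This layout is an assumption for illustration.
    """
    next_names = [name for name, _ in hops[1:]] + [destination]
    cell = payload
    # The innermost layer belongs to the exit relay, so wrap in reverse order.
    for (_, key), nxt in zip(reversed(hops), reversed(next_names)):
        header = nxt.encode()
        cell = xor_layer(key, bytes([len(header)]) + header + cell)
    return cell


def peel(key: bytes, cell: bytes) -> tuple[str, bytes]:
    """Relay side: remove one layer, returning (next hop, inner cell)."""
    plain = xor_layer(key, cell)
    n = plain[0]
    return plain[1:1 + n].decode(), plain[1 + n:]


# Hypothetical three-hop circuit with made-up relay names and keys.
circuit = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
cell = wrap(b"GET / HTTP/1.0", circuit, "example.org")
```

Peeling the cell with each relay's key in turn reveals only the next hop at each step. Only the exit relay sees the payload itself, which is why unencrypted protocols are exposed at that point.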
McCoy et al. set up their own Tor router and analyzed the data passing through it to observe who was using the service and how. They operated their router from December 15 to 19, 2007 and from January 15 to 30, 2008, capturing information about sending addresses and identifying which protocols were used. The authors also implemented a technique for detecting logging, which identifies malicious Tor routers that target insecure protocols in order to capture usernames and passwords.
While internet browsing traffic comprised an overwhelming majority of the connections observed, the BitTorrent protocol (a peer-to-peer protocol used to download large files) consumed a disproportionately large share of bandwidth in the Tor network. Insecure protocols that transmit login credentials in plain text, such as those used by email servers, were also commonly observed. In addition to potentially compromising those accounts, plain text credentials allow a malicious Tor router to link all traffic on the same Tor circuit to the now identified client. The authors found one router logging plain text email traffic, and observed it attempting to log in using a username and password pair they had purposefully provided.
The vast majority of clients originated in Germany, with China and the United States providing the next largest number of clients. Germany and the United States together contributed nearly two thirds of all running routers. From the perspective of the researchers’ relay, the top 2% of all routers in the Tor network transported about 50% of the traffic, while the bottom 75% together transported just 2%.
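A concentration figure of this kind can be computed by sorting routers by the traffic they carry and summing the largest ones. The sketch below is a minimal illustration, not the authors' method: the function name `top_share` is hypothetical, and the 100 bandwidth values are synthetic numbers chosen only to reproduce the reported proportions.

```python
def top_share(bandwidths: list[float], top_fraction: float) -> float:
    """Return the share of total traffic carried by the top fraction of routers."""
    if not bandwidths:
        raise ValueError("bandwidths must not be empty")
    ordered = sorted(bandwidths, reverse=True)
    # Number of routers in the top group, at least one.
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


# Synthetic data (an assumption, not measurements from the paper):
# 2 very fast routers, 23 mid-range routers, 75 slow routers.
synthetic = [1875] * 2 + [156] * 22 + [168] + [2] * 75

top_2_percent = top_share(synthetic, 0.02)
bottom_75_percent = 1 - top_share(synthetic, 0.25)
```

With this synthetic distribution, `top_2_percent` is 0.5 and `bottom_75_percent` is about 0.02, matching the shape of the distribution described above.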
The results suggest that tunneling insecure protocols like email over Tor poses a significant risk to the initiating client’s anonymity. Location diversity in the distribution of Tor routers, although desirable to enhance privacy, is difficult to guarantee because routers are highly concentrated in a few countries. Since the vast majority of Tor traffic is handled by a very small set of routers, an adversary controlling a set of the highest performing routers would be able to conduct traffic analysis and defeat the network’s anonymizing properties. Incentive programs to encourage volunteers to run routers in underrepresented countries should be investigated accordingly.
Tor’s lack of diversity in its router location and bandwidth distribution could compromise client anonymity.