Visualization of Peer-to-Peer Networks

Infoviz Initial Project Proposal
October 19, 2000


Team Members:
Rachna Dhamija, Danyel Fisher, Ka-Ping Yee

Goals: We propose to visualize a peer-to-peer network (such as Gnutella or Freenet). There are two directions this project could take:

1) We can incorporate the network visualization into an interface to help users search and retrieve files and help them find out information about other nodes (e.g., how reliable they are, how likely they are to have interesting content).

2) We can collect and visualize network data to help researchers/system designers answer questions such as:

Network topology: How does it change over time? How long are people connected?

Information flow: How do files propagate within and across networks?

Trust metrics: How is trust determined in this environment? Which nodes act as introducers?

User metrics: Who are the users? What is their bandwidth? What clients are they using? How does behavior change over time?

Survivability: What is the degree of connectedness? How tolerant is the network to error and attack?

Scalability: Where do bottlenecks occur? How can routing be optimized?

Queries: Are queries satisfied? How long does it take?
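
As a rough illustration of how the topology and survivability questions could be made measurable, the sketch below computes a degree histogram and the size of the largest connected component from a snapshot of the network. The edge-list input format, the function names, and the sample hosts are all assumptions for illustration; no such data set has been collected yet.

```python
from collections import Counter, defaultdict, deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def degree_histogram(adj):
    """Map each degree value to the number of nodes with that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())

def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best

# Hypothetical snapshot: two disconnected clusters of hosts.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
adj = build_adjacency(edges)
print(dict(degree_histogram(adj)))
print(largest_component_size(adj))
```

Running the same two functions on snapshots taken at different times would give a first, crude view of how the topology changes.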

 

Steps (and tools) required to accomplish these goals:

1) First we need to formulate a tighter research question and choose one network to investigate (e.g., Gnutella, Freenet)

2) Next we need to get data! It is possible to instrument a Gnutella client to collect data (once instrumented, we would be able to gather a significant amount of data over a short time period). It may also be possible to obtain data from a Freenet server that has been running for some time.

3) Data processing and massage.

4) Exploration (preferably using existing viz tools, but this depends on how ambitious we are.)
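
For step 2, an instrumented client would mostly need to log the header of each message it relays. The sketch below decodes one header, assuming the 23-byte layout of the publicly documented Gnutella 0.4 protocol (16-byte descriptor ID, payload type, TTL, hops, little-endian payload length). That layout should be checked against whichever client we actually instrument, and the function and field names here are hypothetical.

```python
import struct

# Assumed Gnutella 0.4 descriptor header: 16-byte ID, payload type,
# TTL, hops, and a little-endian 32-bit payload length (23 bytes total).
HEADER = struct.Struct("<16sBBBI")

# Payload type codes as given in the Gnutella 0.4 protocol description.
PAYLOAD_TYPES = {
    0x00: "ping",
    0x01: "pong",
    0x40: "push",
    0x80: "query",
    0x81: "query_hit",
}

def parse_header(data):
    """Decode one descriptor header into a dict suitable for logging."""
    if len(data) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    descriptor_id, ptype, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "id": descriptor_id.hex(),
        "type": PAYLOAD_TYPES.get(ptype, "unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }

# Made-up header bytes for a query that has travelled two hops.
sample = HEADER.pack(b"\x01" * 16, 0x80, 5, 2, 12)
print(parse_header(sample))
```

Logging one such record per message, with a timestamp, would already support the query and information-flow questions listed under the goals.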

 

Some previous work we can build on:

Free Riding on Gnutella - Eytan Adar and Bernardo Huberman, Xerox PARC. An analysis of user traffic on Gnutella shows a significant amount of free riding in the system. By sampling messages on the Gnutella network over a 24-hour period, the authors established that almost 70% of Gnutella users share no files, and nearly 50% of all responses are returned by the top 1% of sharing hosts.
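
The two statistics quoted above are simple to recompute once per-host counts are available. The sketch below is one possible formulation, not the authors' method: it assumes hypothetical dictionaries mapping each host to its number of shared files and to its number of responses, and it ranks "top" hosts by response count, which is our assumption.

```python
import math

def free_rider_fraction(files_shared):
    """Fraction of hosts that share no files at all."""
    if not files_shared:
        return 0.0
    free_riders = sum(1 for count in files_shared.values() if count == 0)
    return free_riders / len(files_shared)

def top_share(responses, top_fraction=0.01):
    """Fraction of all responses returned by the top `top_fraction` of hosts."""
    counts = sorted(responses.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one host, even in a tiny sample.
    k = max(1, math.ceil(len(counts) * top_fraction))
    return sum(counts[:k]) / total

# Made-up sample of four hosts, for illustration only.
files_shared = {"h1": 0, "h2": 0, "h3": 5, "h4": 10}
responses = {"h1": 0, "h2": 0, "h3": 10, "h4": 30}
print(free_rider_fraction(files_shared))
print(top_share(responses, top_fraction=0.25))
```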

Steve G. Steinberg's map of the Gnutella network - Steinberg modified a Gnutella client to perform the equivalent of traceroute and created the map using Graphviz. The graph captures a single point in time, from the point of view of one node.
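
A comparable map could be produced from our own data by writing the observed links out in Graphviz's DOT language. The sketch below shows one way to do that; the edge list and the function name are hypothetical, and this is not a reconstruction of Steinberg's tooling.

```python
def to_dot(edges, name="gnutella"):
    """Render undirected (host, host) pairs as a Graphviz DOT graph.

    Duplicate links, including reversed pairs, are emitted once.
    Host names are assumed not to contain double quotes.
    """
    unique = sorted({tuple(sorted(edge)) for edge in edges})
    lines = ["graph %s {" % name]
    for a, b in unique:
        lines.append('    "%s" -- "%s";' % (a, b))
    lines.append("}")
    return "\n".join(lines)

# Made-up links as seen from one vantage point.
edges = [("b", "a"), ("a", "b"), ("a", "c")]
print(to_dot(edges))
```

Saving the output to a file such as map.dot and running `neato -Tpng map.dot -o map.png` would lay the graph out and render it as an image.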

Bandwidth Barriers to Gnutella Network Scalability - DSS Clip2, September 8, 2000. The scalability of a Gnutella network to accommodate more users performing more searches is limited by the lowest bandwidth links prevalent within the network.

Error and attack tolerance of complex networks - Reka Albert, Hawoong Jeong & Albert-Laszlo Barabasi, University of Notre Dame, Nature, July 2000. The authors find that scale-free networks, including the Internet, display an unexpected degree of robustness: the ability of their nodes to communicate is unaffected even by unrealistically high failure rates. However, error tolerance comes at a high price, in that these networks are extremely vulnerable to attacks (that is, to the selection and removal of a few nodes that play a vital role in maintaining the network’s connectivity).
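
The contrast between random failure and targeted attack can be tried on any snapshot we collect. The sketch below is a simplified experiment in the spirit of that paper, not its actual procedure: it removes either randomly chosen nodes or the highest-degree nodes and reports the size of the largest surviving component. The star-shaped test graph and all function names are assumptions for illustration.

```python
import random
from collections import deque

def largest_component(adj, removed=()):
    """Largest connected component size after deleting the `removed` nodes."""
    seen = set(removed)  # treating removed nodes as visited skips them
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best

def attack_targets(adj, k):
    """The k highest-degree nodes: a deliberate attack on the hubs."""
    return sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:k]

def failure_targets(adj, k, rng):
    """k nodes chosen uniformly at random: accidental failures."""
    return rng.sample(sorted(adj), k)

# Hypothetical star network: hub 0 connected to leaves 1..9.
star = {0: set(range(1, 10))}
for leaf in range(1, 10):
    star[leaf] = {0}

rng = random.Random(0)
print(largest_component(star, failure_targets(star, 1, rng)))
print(largest_component(star, attack_targets(star, 1)))
```

In this toy graph, losing any single leaf leaves nine nodes connected, while removing the hub isolates every remaining node, which is the asymmetry the paper describes for scale-free networks.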