Visualization of Peer to Peer Networks

Infoviz Revised Project Proposal
October 31, 2000


Team Members:
Rachna Dhamija, Danyel Fisher, Ka-Ping Yee

Project goals: Our goal is to create an interface to visualize peer to peer networks. The challenge is to assemble data about distributed nodes, data, and messages into a comprehensive picture of the network for users.

Target Users: Our target user population includes researchers who wish to investigate properties of the network and system designers who wish to see the impact of design choices on the network. End users of the system may also find several visualization features useful for learning about the network and for improving their search performance.

Data gathering and choice of network to visualize:

We have successfully logged Gnutella network data, both by instrumenting an existing Gnutella client and by writing our own client to collect data. We use our client as a probe: it issues queries and pings to other nodes, and it monitors and logs all messages that other nodes send through us (search queries, search responses, pings, and pongs). We are now looking at ways to place several probes strategically at different points in the network to sample data.
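As an illustration of what a probe's logging step might look like, the sketch below decodes the 23-byte descriptor header defined in the Gnutella 0.4 protocol specification (16-byte descriptor ID, payload type, TTL, hops, and a little-endian payload length). The function name parse_header and the shape of the returned record are assumptions made for this example; they do not describe our actual client.

```python
import struct

# Payload descriptor codes from the Gnutella 0.4 protocol specification.
DESCRIPTOR_NAMES = {
    0x00: "ping",
    0x01: "pong",
    0x40: "push",
    0x80: "query",
    0x81: "query_hit",
}

HEADER_LEN = 23  # 16-byte ID + type + TTL + hops + 4-byte payload length


def parse_header(data: bytes) -> dict:
    """Decode one descriptor header into a log record (hypothetical format)."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 23 bytes for a descriptor header")
    msg_id = data[:16]
    # "<" selects little-endian with no padding: three bytes and one uint32.
    kind, ttl, hops, payload_len = struct.unpack("<BBBI", data[16:HEADER_LEN])
    return {
        "id": msg_id.hex(),
        "kind": DESCRIPTOR_NAMES.get(kind, "unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_len": payload_len,
    }
```

A probe could call parse_header on the first 23 bytes of each incoming descriptor, append the record to its log, and then read payload_len further bytes to obtain the payload.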

Freenet is also an interesting network to visualize, because more attention has been given to routing and security, and some attention is now being given to search and metadata. A single node has only a very limited view of the Freenet network, which is why designers want an aggregate view. The disadvantages for us are that Freenet has a smaller network of users and that real data is harder to collect. To obtain a picture of the network, we would need to collect data from a large number of nodes: we would have to update the Freenet client to log data and then ask node operators to run this version and send us their logs. In this case, our strategy would be to prototype a visualization with simulated data first and then collect real data.

For these reasons, we have decided to initially develop a visualization based on real Gnutella data and to include Freenet data later if we have time.

Description of the visualization: We intend to use existing network and graph visualization tools and to adapt them to our specific needs.

Challenges include:

- how to visualize extremely large amounts of data in a coherent way
- how to visualize a quickly changing network
- how to customize existing visualization techniques to support our tasks

Specific tasks include:

1) Obtain information about:

individual nodes (IP address, connection speed)

node behavior over time - reliability, % of requests answered

messages: individual messages, related messages, e.g., paths that searches and replies take

the network as a whole: topology, connectedness, growth, performance

content: how content is replicated
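One of the node-behavior measures listed above, the percentage of requests answered, could be derived from a probe log roughly as follows. This is a minimal sketch under stated assumptions: the log is taken to be a list of (direction, node, kind, msg_id) tuples, a format invented for this example, and a query counts as answered when a query hit with the same message ID comes back over the same connection. Because hits are routed back along the query path, the rate describes what arrives through a neighbor, not necessarily what that neighbor itself stores.

```python
from collections import defaultdict


def answer_rates(log):
    """Return {node: fraction of queries sent to it that produced a hit}.

    `log` is a list of (direction, node, kind, msg_id) tuples, where
    direction is "out" for messages we sent and "in" for messages received.
    This record format is hypothetical.
    """
    asked = defaultdict(set)     # node -> IDs of queries we sent to it
    answered = defaultdict(set)  # node -> IDs of hits that came back from it
    for direction, node, kind, msg_id in log:
        if direction == "out" and kind == "query":
            asked[node].add(msg_id)
        elif direction == "in" and kind == "query_hit":
            answered[node].add(msg_id)
    # Only count hits that match a query we actually sent to that node.
    return {
        node: len(answered[node] & ids) / len(ids)
        for node, ids in asked.items()
    }
```

Computing the same rates over successive time windows would give the "behavior over time" view of each node.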

2) Troubleshooting: can we use the visualization to gain insight into weaknesses of the system?

routing: how can it be improved, and where and why do bottlenecks occur?

spotting bad behavior and spam

privacy/anonymity weaknesses: how easy is it to track content distributors/searchers? If the network is anonymous, can we break that anonymity?

security/resistance to attack: what percentage of nodes can be removed, and how will the network recover?
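The node-removal question can be explored on a logged topology with a simple experiment: remove a random fraction of nodes and measure the size of the largest group of nodes that can still reach one another. The sketch below assumes the topology is available as a symmetric adjacency dictionary (every node appears as a key); the function names, the random-failure model, and the choice of largest connected component as the measure are all assumptions for illustration, and a targeted attack on highly connected nodes would need a different removal order.

```python
import random
from collections import deque


def largest_component(adj, removed=frozenset()):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen = set(removed)  # removed nodes are treated as already visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def removal_curve(adj, fractions, seed=0):
    """Return [(fraction, largest component size)] under random node removal."""
    order = list(adj)
    random.Random(seed).shuffle(order)  # fixed seed keeps runs repeatable
    curve = []
    for fraction in fractions:
        removed = frozenset(order[: int(fraction * len(order))])
        curve.append((fraction, largest_component(adj, removed)))
    return curve
```

Plotting the curve for fractions such as 0.0, 0.1, 0.2, and so on would show how quickly the network fragments as nodes disappear.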

3) Demonstrate how Gnutella/Freenet works: It is hard for end users to understand the operation of the network. Some of the visualization techniques we develop to answer the specific questions of researchers and system designers are likely to be useful to end users as well, by showing how the network works and what effects their actions have. For example, users may ask:

How many nodes am I connected to?
What is the path that my search queries take?
What happens when I adjust certain parameters, such as the breadth of my search or the number of nodes that can connect to me?
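The effect of search breadth can be illustrated with a small model of query flooding. In Gnutella, a query is forwarded to all neighbors with its time-to-live (TTL) decremented at each hop, and duplicates are dropped, so the set of nodes reached is what a breadth-first search finds within TTL hops. The sketch below assumes a known adjacency dictionary and uses a hypothetical function name, flood_reach; it models only reachability, not timing or dropped connections.

```python
from collections import deque


def flood_reach(adj, origin, ttl):
    """Return {node: hops} for every node a query from `origin` reaches.

    `adj` maps each node to its neighbors; `ttl` is the maximum hop count.
    """
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if hops[node] == ttl:
            continue  # TTL exhausted: this node does not forward the query
        for neighbor in adj[node]:
            if neighbor not in hops:  # duplicates of a query are dropped
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops
```

Calling len(flood_reach(adj, me, ttl)) - 1 for increasing TTL values shows how many other nodes each extra hop adds, and the hops values give the shortest path length that a query takes to each node.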

 

Project Timeline:

Milestone | Date(s)
Literature review | October 27
Task analysis: formulate tasks and specific research questions. In particular, we are interested in collecting and visualizing data in order to refute or verify claims that have been made about the network. | Nov 1
Gather and clean data | Nov 3
Review of existing network and graph viz tools | Nov 6
Prototype visualization ideas | Nov 10
Implement interface | Nov 20
Evaluate interface: use the interface to analyze data. Are we able to refute or verify any claims made about the network? | Nov 20 - Nov 30
Final class presentation | Nov 30, Dec 5 or Dec 7
Final paper | Dec 9

 

This document: http://www.sims.berkeley.edu/~rachna/courses/infoviz/proposal2.html
Last Modified: October 31, 2000