P2P
Peer-to-peer (P2P) is a type of transient Internet network that allows a group of computer users with the same networking program to connect with each other and directly access files from one another's hard drives.
Gnutella is an example of peer-to-peer software in which individuals can directly exchange files over the Internet. Gnutella is fully decentralized, so a user connected to the network may access the files of other users on the network. Users serve as both clients and servers connected in a daisy-chain fashion. Since there are no central servers, there may be a high bandwidth requirement.
(The P2P and Gnutella definitions above are from http://whatis.techtarget.com)
Gnutella Protocol Descriptors
Descriptor Header
Descriptor ID (0-15) | Payload Descriptor (16) | TTL (17) | Hops (18) | Payload Length (19-22) |
Descriptor ID: A 16-byte string uniquely identifying the descriptor on the network.
Payload Descriptor: 0x00 = Ping; 0x01 = Pong; 0x40 = Push; 0x80 = Query; 0x81 = QueryHit.
TTL: Time To Live. The number of times the descriptor will be forwarded by Gnutella servents before it is removed from the network. Each servent will decrement the TTL before passing it on to another servent. When the TTL reaches 0, the descriptor will no longer be forwarded. The TTL is the only way to remove descriptors from the network and if it is left unmonitored, high network traffic and poor performance will likely result.
Hops: The number of times the descriptor has been forwarded. The TTL and Hops fields must satisfy the following condition as the descriptor is passed from servent to servent:
TTL (0) = TTL (i) + Hops (i),
Where TTL (i) and Hops (i) are the value of the TTL and Hops fields on the header at the descriptor’s i-th hop, for i>=0.
Payload Length: The length of the descriptor immediately following this header. The next descriptor header is located exactly Payload_Length bytes from the end of this header. In other words, there are no gaps in the Gnutella data stream. The Payload Length field is the only way for a servent to find the beginning of the next descriptor in the input stream. This field should be monitored so that the servent remains in sync with its input stream. The connection is dropped if and when the servent falls out of sync with its input stream.
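The header layout and the "no gaps" rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the little-endian byte order of Payload Length is an assumption (the text above does not state it), and the names HEADER, pack_header and split_stream are made up for this example.

```python
import struct

# Assumed layout: 16-byte Descriptor ID, 1-byte Payload Descriptor,
# 1-byte TTL, 1-byte Hops, 4-byte Payload Length (little-endian is an
# assumption). "<" disables padding, so the header is 23 bytes.
HEADER = struct.Struct("<16sBBBI")


def pack_header(descriptor_id, payload_type, ttl, hops, payload_length):
    """Build the 23-byte descriptor header."""
    return HEADER.pack(descriptor_id, payload_type, ttl, hops, payload_length)


def split_stream(buffer):
    """Split a byte buffer into complete descriptors.

    Returns (descriptors, leftover). Each descriptor is a pair of
    (header_fields, payload). Payload Length is the only thing used to
    locate the next header, as described above.
    """
    descriptors = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        fields = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + fields[4]
        if end > len(buffer):
            break  # payload not fully received yet
        descriptors.append((fields, buffer[offset + HEADER.size:end]))
        offset = end
    return descriptors, buffer[offset:]
```

Whatever is returned as leftover is kept and prepended to the next chunk read from the connection, which is how a servent stays aligned with its input stream.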
Ping (0x00)
The purpose of a Ping request is to announce the servent’s presence on the network, or more precisely, to actively probe the network for other servents. It includes a TTL count, which determines how many times the request can be forwarded to other computers. TTL is 7 by default. Ping descriptors have no payload and are of zero length.
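As a rough sketch of how the TTL and Hops fields behave when a Ping travels through the network, the Python below creates a Ping with the default TTL of 7 and applies the forwarding rule from the header description. The Header class and the function names are hypothetical, and using random bytes for the Descriptor ID is an assumption.

```python
import os
from dataclasses import dataclass, replace

PING = 0x00
DEFAULT_TTL = 7


@dataclass(frozen=True)
class Header:
    descriptor_id: bytes
    payload_type: int
    ttl: int
    hops: int
    payload_length: int


def new_ping(ttl=DEFAULT_TTL):
    """A Ping has no payload, so its Payload Length is 0."""
    # Random bytes as the Descriptor ID is an assumption of this sketch.
    return Header(os.urandom(16), PING, ttl, 0, 0)


def forward(header):
    """Return the header to pass on, or None when the TTL has run out.

    The TTL is decremented and Hops incremented together, so
    TTL(0) = TTL(i) + Hops(i) holds at every hop.
    """
    ttl = header.ttl - 1
    if ttl <= 0:
        return None  # TTL reached 0: do not forward
    return replace(header, ttl=ttl, hops=header.hops + 1)
```

With the default TTL, a Ping is forwarded six times after it is first sent; the servent that would bring the TTL down to 0 drops it.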
Pong (0x01)
Port (0-1) | IP Address (2-5) | Number of Files Shared (6-9) | Number of Kilobytes Shared (10-13) |
Port: The port number on which the responding host can accept incoming connections.
IP Address: The IP address of the responding host.
Number of Files Shared: The number of files that the servent with the given IP address and port is sharing on the network.
Number of Kilobytes Shared: The number of kilobytes of data that the servent with the given IP address and port number is sharing on the network.
A Pong descriptor is only sent in response to an incoming Ping descriptor. More than one Pong may be sent in response to one Ping, which enables the host caches to send cached servent address information.
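The 14-byte Pong payload maps directly onto a fixed-size structure. The sketch below assumes little-endian integers and an IP address stored as four bytes in network order; neither is stated on this page. The function names and the sample values (port 6346, the address, the counts) are illustrative only.

```python
import socket
import struct

# Assumed layout: 2-byte port, 4-byte IPv4 address, 4-byte file count,
# 4-byte kilobyte count (14 bytes total, no padding).
PONG = struct.Struct("<H4sII")


def pack_pong(port, ip, files, kilobytes):
    """Build a Pong payload from a port, dotted-quad IP and share counts."""
    return PONG.pack(port, socket.inet_aton(ip), files, kilobytes)


def unpack_pong(payload):
    """Decode a Pong payload into a dictionary of its four fields."""
    port, raw_ip, files, kilobytes = PONG.unpack(payload)
    return {
        "port": port,
        "ip": socket.inet_ntoa(raw_ip),
        "files": files,
        "kilobytes": kilobytes,
    }


if __name__ == "__main__":
    print(unpack_pong(pack_pong(6346, "192.168.1.20", 12, 3400)))
```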
Query (0x80)
Minimum Speed (0-1) | Search Criteria (2-...) |
Minimum Speed: The minimum speed, in kilobits per second (kb/s), of servents that can respond to this message. A servent receiving a Query descriptor with a Minimum Speed field of n kb/s should only respond with a QueryHit if it is able to communicate at a speed greater than or equal to n kb/s.
Search Criteria: A null (i.e. 0x00) terminated search string. The maximum length of this string is bounded by the Payload_Length field of the descriptor header.
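A Query payload is a 2-byte speed followed by a null-terminated string, which the following Python sketch encodes and decodes. Little-endian byte order and ASCII text are assumptions, and should_answer is a hypothetical helper that only restates the Minimum Speed rule.

```python
import struct


def pack_query(min_speed, criteria):
    """Minimum Speed (2 bytes) followed by the 0x00-terminated search string."""
    return struct.pack("<H", min_speed) + criteria.encode("ascii") + b"\x00"


def unpack_query(payload):
    """Return (min_speed, criteria) from a Query payload."""
    (min_speed,) = struct.unpack_from("<H", payload, 0)
    text, _, _ = payload[2:].partition(b"\x00")  # stop at the terminator
    return min_speed, text.decode("ascii")


def should_answer(min_speed, my_speed, has_match):
    """Reply only with matching data and a speed of at least min_speed kb/s."""
    return has_match and my_speed >= min_speed
```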
QueryHit (0x81)
Number of Hits (0) | Port (1-2) | IP Address (3-6) | Speed (7-10) | Result Set (11-...) | Servent Identifier (n-n+16) |
Number of Hits: The number of query hits in the Result Set.
Port: The port number on which the responding host can accept incoming connections.
IP Address: The IP address of the responding host.
Speed: The speed, in kb/s, of the responding host.
Result Set: A set of responses to the corresponding Query. This set contains Number_of_Hits elements, each with the following structure:
File Index (0-3) | File Size (4-7) | File Name (8-...) |
File Index: A number assigned by the responding host that uniquely identifies the file.
The size of the result set is bounded by the size of the Payload_Length field in the Descriptor Header.
Servent Identifier: A 16-byte string uniquely identifying the responding servent on the network. This is typically some function of the servent’s network address. The Servent Identifier is instrumental in the operation of the Push Descriptor.
QueryHit descriptors are only sent in response to an incoming Query descriptor. A servent should only reply to a Query with a QueryHit if it contains data that strictly meets the Query Search Criteria. The Descriptor_ID field in the Descriptor Header of the QueryHit should contain the same value as that of the associated Query descriptor. This allows a servent to identify the QueryHit descriptors associated with Query descriptors it generated.
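The QueryHit payload is the only variable-length structure with nested records, so a sketch of how it might be walked is useful. Several things here are assumptions rather than statements from this page: little-endian integers, ASCII file names, a two-byte 0x00 0x00 terminator after each file name, and the Servent Identifier occupying the last 16 bytes of the payload. All function names are hypothetical.

```python
import socket
import struct

HIT_HEAD = struct.Struct("<BH4sI")  # hits, port, IP address, speed (11 bytes)
RESULT_HEAD = struct.Struct("<II")  # File Index, File Size
NAME_END = b"\x00\x00"              # assumed file name terminator


def pack_query_hit(port, ip, speed, results, servent_id):
    """results is a list of (file_index, file_size, file_name) tuples."""
    body = HIT_HEAD.pack(len(results), port, socket.inet_aton(ip), speed)
    for index, size, name in results:
        body += RESULT_HEAD.pack(index, size) + name.encode("ascii") + NAME_END
    return body + servent_id


def unpack_query_hit(payload):
    """Decode the fixed fields, then Number_of_Hits result records."""
    hits, port, raw_ip, speed = HIT_HEAD.unpack_from(payload, 0)
    offset = HIT_HEAD.size
    results = []
    for _ in range(hits):
        index, size = RESULT_HEAD.unpack_from(payload, offset)
        offset += RESULT_HEAD.size
        end = payload.index(NAME_END, offset)  # end of this file name
        results.append((index, size, payload[offset:end].decode("ascii")))
        offset = end + len(NAME_END)
    return {
        "port": port,
        "ip": socket.inet_ntoa(raw_ip),
        "speed": speed,
        "results": results,
        "servent_id": payload[-16:],
    }
```

The File Index values recovered here are what a later Push or download request would refer to.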
Push (0x40)
Servent Identifier (0-15) | File Index (16-19) | IP Address (20-23) | Port (24-25) |
Servent Identifier: The 16-byte string uniquely identifying the servent on the network that is being requested to push the file with index File_Index. The servent initiating the push request should set this field to the Servent_Identifier returned in the corresponding QueryHit descriptor. This allows the recipient of a push request to determine whether or not it is the target of that request.
File Index: The index uniquely identifying the file to be pushed from the target servent. The servent initiating the push request should set this field to the value of one of the File_Index fields from the Result Set in the corresponding QueryHit descriptor.
IP Address and Port: The IP address and port of the host to which the file with File_Index should be pushed.
A servent may send a Push descriptor if it receives a QueryHit descriptor from a servent that does not support incoming connections. This might occur when the servent sending the QueryHit descriptor is behind a firewall. When a servent receives a Push descriptor, it may act upon the push request if and only if the Servent_Identifier field contains the value of its servent identifier. The Descriptor_ID field in the Descriptor Header of the Push descriptor should not contain the same value as that of the associated QueryHit descriptor, but should contain a new value generated by the servent’s Descriptor_ID generation algorithm.
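The Push payload and the "act only if the Servent_Identifier is mine" rule can be sketched as follows. Byte order is again an assumption (little-endian integers, IP address as four bytes in network order), and the function names are invented for this example.

```python
import socket
import struct

# Assumed layout: 16-byte Servent Identifier, 4-byte File Index,
# 4-byte IPv4 address, 2-byte port (26 bytes total).
PUSH = struct.Struct("<16sI4sH")


def pack_push(target_id, file_index, ip, port):
    """Build a Push payload asking target_id to push file_index to ip:port."""
    return PUSH.pack(target_id, file_index, socket.inet_aton(ip), port)


def unpack_push(payload):
    """Return (servent_id, file_index, ip, port)."""
    servent_id, file_index, raw_ip, port = PUSH.unpack(payload)
    return servent_id, file_index, socket.inet_ntoa(raw_ip), port


def is_push_for_me(payload, my_servent_id):
    """A servent acts on a Push only when the identifier matches its own."""
    return payload[:16] == my_servent_id
```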
To connect to the Gnutella network, a servent must first obtain the IP address of another servent that is currently on the network. Once the address is obtained, a TCP/IP connection to that servent is created and the Gnutella connection request string is sent: the handshake message “GNUTELLA CONNECT/0.4\n\n” goes to the other peer, which then responds with “GNUTELLA OK\n\n”. Connections may be rejected, for example, because the versions are not compatible or because that particular servent already has too many connections.
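The handshake itself is just two fixed strings exchanged over a TCP connection. The sketch below shows both sides using Python's socket module; the function names are hypothetical, and a real servent would also need timeouts and a way to signal rejection, which this page does not describe.

```python
import socket

CONNECT = b"GNUTELLA CONNECT/0.4\n\n"
OK = b"GNUTELLA OK\n\n"


def read_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes first."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed during handshake")
        data += chunk
    return data


def send_connect(sock):
    """Connecting side: send the connection request string."""
    sock.sendall(CONNECT)


def accept_connect(sock):
    """Accepting side: reply with OK only to a well-formed request."""
    if read_exact(sock, len(CONNECT)) != CONNECT:
        return False
    sock.sendall(OK)
    return True


def confirm_ok(sock):
    """Connecting side: check that the peer answered with OK."""
    return read_exact(sock, len(OK)) == OK
```

Only after confirm_ok returns True would the two servents start exchanging descriptors on that connection.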
Once a servent receives a QueryHit descriptor, it may initiate the direct download of one of the files described by the descriptor's Result Set. Files are downloaded out-of-network (i.e. a direct connection between the source and target servent is established in order to perform the data transfer). File data is never transferred over the Gnutella network. The file download protocol is HTTP. The servent initiating the download sends a request string of the following form to the target server:
The server receiving this download request responds with HTTP 1.0 compliant headers such as
The file data then follows and should be read up to and including the number of bytes specified in the Content-length provided in the server’s HTTP response. The HTTP Range parameter is used so that interrupted downloads may be resumed at the point where they terminated.
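Two parts of the download step lend themselves to a short sketch: asking for the rest of an interrupted file with a Range header, and reading exactly Content-length bytes of file data. The request line itself is not reproduced here. The helper names are hypothetical, the open-ended "bytes=N-" form is standard HTTP rather than something stated on this page, and any header text passed in is sample data.

```python
import io


def range_header(bytes_already_received):
    """Ask for the rest of the file, starting where the last attempt stopped."""
    return f"Range: bytes={bytes_already_received}-"


def parse_content_length(header_block):
    """Find Content-length in the server's response headers, if present."""
    for line in header_block.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-length":
            return int(value.strip())
    return None


def read_body(stream, length):
    """Read exactly `length` bytes of file data from a binary stream."""
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("download interrupted; resume with a Range request")
    return data


if __name__ == "__main__":
    head = "HTTP 200 OK\r\nContent-length: 5\r\n"  # sample header text
    print(read_body(io.BytesIO(b"hello world"), parse_content_length(head)))
```

If read_body raises, the number of bytes already saved is what would be passed to range_header on the next attempt.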
If a direct connection to download from a servent cannot be established due to the presence of a firewall, the servent attempting the download may request a file push. The requesting servent routes a Push request to the servent that holds the desired file. Upon receipt of this Push descriptor, the servent holding the file should establish a new TCP/IP connection to the requesting servent. If both parties are behind a firewall, the connection cannot be established and the file transfer cannot take place. If a direct connection can be established, the servent behind the firewall sends the following message:
Where
Then the file is transferred just like in a regular download.
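The firewall cases described above reduce to a small decision table, sketched here in Python. The Transfer enum and choose_transfer are names invented for this illustration; the sketch only restates the three outcomes in the text and does not open any connections.

```python
from enum import Enum


class Transfer(Enum):
    DIRECT = "direct"          # requester connects to the source
    PUSH = "push"              # source connects back to the requester
    IMPOSSIBLE = "impossible"  # neither side can accept a connection


def choose_transfer(requester_firewalled, source_firewalled):
    """Pick how a file can be fetched, given which sides are firewalled."""
    if not source_firewalled:
        return Transfer.DIRECT
    if not requester_firewalled:
        return Transfer.PUSH
    return Transfer.IMPOSSIBLE
```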
In order to properly route replies on the network, nodes keep a routing table. This table keeps track of recent traffic. Each table entry should have a message ID, descriptor ID and connection ID. In this way the node can see where specific descriptors came from. For example, a node will get a Ping from Node X with ID = 5. It will reply with a Pong and also forward that Ping on to its directly connected neighbors. When those neighbors reply, the node will have no way of knowing that those Pongs go back to Node X unless it has stored the information. When the node sees a Pong with ID = 5, it knows that this can only be a response to a Ping with the same ID. It looks the ID up in the table and routes the Pong to the source of the original Ping. The same is true for Queries and QueryHits. This reduces erroneous packets on the network. If a node is misbehaving and sending out Pongs for no reason, other nodes will know to discard those packets because they did not see the corresponding Ping at any time.
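The routing idea above can be sketched as a dictionary from descriptor ID to the connection a request arrived on. This is a simplified illustration: the class and method names are hypothetical, the table is keyed on the descriptor ID alone, and entries are never expired, whereas a real table tracking "recent traffic" would also age them out.

```python
class RoutingTable:
    """Remembers which connection each Ping or Query came from."""

    def __init__(self):
        self._origin = {}

    def note_request(self, descriptor_id, connection_id):
        # Keeping the first arrival is a design choice of this sketch.
        self._origin.setdefault(descriptor_id, connection_id)

    def route_reply(self, descriptor_id):
        """Connection to send a Pong or QueryHit back on, or None.

        None means no matching Ping or Query was ever seen, so the
        reply should be discarded.
        """
        return self._origin.get(descriptor_id)


if __name__ == "__main__":
    table = RoutingTable()
    table.note_request(5, "Node X")  # Ping with ID = 5 arrives from Node X
    print(table.route_reply(5))      # Pong with ID = 5 goes back to Node X
    print(table.route_reply(9))      # unknown ID: discard
```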