Making peer-to-peer systems scalable Presented by Elisavet Kozyri What is a peer-to-peer system? A distributed application architecture that partitions tasks or work loads between peers Main actions: Find the owner of the file (indexing) Get the file from the owner
Popular P2P Systems indexing Napster (1999 2001) Gnutella (2000 now) get file What was it missing? Scalable indexing mechanism
Goals Each node (peer) should be responsible for certain files System remain robust during and after the arrival and departure of nodes Observation At the new system: Given an identifier of data it should find the owner. If a node joins or leaves it should rearrange the data. Is it similar to hash table? node bucket
identifier key data value Distributed hash table Distributed Hash Table Abstraction: Simple interface Similar to hash table Pairs (key, value) are spread across the nodes Implementation:
Name space partitioning Each node responsible for a fraction Each node has ID from the same name space Nodes communicate over an overlay network to route lookup requests Feb abcd
Jun bcd Apr bcde Jul cde Jan
May Sep Nov ebcd Chord: A DHT Implementation Chord: Authors Ion Stoica
Associate Professor of CS at Berkeley Robert Morris Professor in the EECS department at MIT David Karger Professor in the EECS department at MIT M. Frans Kaashoek Professor in the EECS department at MIT Hari Balakrishnan Professor in the EECS department at MIT
Chord: Goals Load balance Keys are spread evenly over the nodes Decentralization No node is more important than any other Scalability The cost of lookups grows as the log of the number of nodes Availability The node responsible for the key can always be found
Flexible naming No constraints on the structure of the keys Chord: Protocol Hashing Determines the name space and the way it is partitioned among nodes. Routing Determines the way lookup requests will be routed among nodes to reach their destination.
Joining Determines the way the system adopts itself at the arrival of a node Chord: Hashing Consistent hash function Each node and key has an m-bit identifier Identifiers ordered in an identifier circle Key k belongs to the node which identifier is the first clockwise from k
Each node is randomly assigned a 128-bit identifier (nodeID) Each object is assigned an identifier at least 128 bits long (objID) An object belongs to a node if nodeID is numerically closest to the 128 most significant bits of the objID An object is stored on k nodes Pastry: Hashing 00100 001
11010 01010 110 011 10100 100 10011 01101
Pastry: Routing nodeID Level 0 1 1 Level 1 0 0 1
Level 2 0 0 0 Level 3 1 1
1 1 A message whose destID matches the local nodes nodeID up to level l is forwarded to a node whose nodeID matches the destID up to least l+1. Routing table For each level l, the routing table contains the IP address of 2b-1 nodes that have the same nodeID prefix as the local node up to level l-1, but differ at level l. From all possible nodes, the closest are selected.
CAN Content-Addressable Network CAN: Hashing d-torus Each node owns a zone within the overall space A key is mapped onto a point in the space If the point P belongs to the zone of node n, then the corresponding (key, value) is stored at n.
CAN: Routing Routing table: IP address and virtual coordinate zone of each immediate neighbors A node forwards the message to the neighbor with coordinates closest to destination Comparison of DHT Geometries The Impact of DHT Routing Geometry of
Resilience and Proximity K. Gummadi: Head of Networked Systems Research Group at Max Planck Institute for Software Systems R. Gummadi: Assistant Professor, ECE, UMass Amherst S. Gribble: Associate Professor, CSE, University of Washington S. Ratnasamy: Researcher at Intel Research Berkeley
S. Shenker: Professor, EECS, UC Berkeley I. Stoica: Associate Professor, CS, US Berkeley Comparison of DHT Geometries Resilience The ability of the DHT implementation to route during and after arrivals/departures of nodes. Proximity The ability of the DHT implementation to adapt to the underlying Internet topology. Flexibility
Neighbor Selection Route Selection Tree 000 001 010 011 100
PRR, Tapestry Generous flexibility in choosing neighbors No flexibility in route selection 101 110 111 Hypercube 110 100 111
101 010 000 011 001 CAN Flexibility in route selection No flexibility in choosing neighbors Ring
000 Chord Changes ith neighbor of a belongs to [(a + 2i),(a+ 2i+1)] multiple paths 111 001 110
010 101 011 100 Flexibility in route selection Flexibility in choosing neighbors Hybrid
Pastry (ring + tree) Flexibility in route selection Flexibility in choosing neighbors Static Resilience Static resilience Routing flexibility Proximity Proximity Neighbor Selection (PNS) Proximity Route Selection (PRS) What is the best? Discussion
Are ring geometries the best? What is the importance of sequential neighbors? How does neighbors flexibility influence resilience/proximity? Chord, CAN, Pastry: Are they used today? Which is the best? General comments References Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems, Antony Rowston, Peter Druschel
A Scalable Content-Addressable Network, Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard karp, Scott Shenker Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, Ion Stoica, Robert Morris, David Liben-Nowell, David Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan Geometry shapes from Krishnas SIGCOMM talk Thank you!
Ginkgo (the "smart pill") Lecithin (brain food), Lycopene (antioxidant from tomato—100x more effective than vitamin E) Alpha lipoic acid (anti-aging/memory) Rhododendrencaucasicum(60x antioxidant protection of Vitamin E and 20x that of C)
Develop real time spray monitoring system. Build / design required hardware interfaces. Design a geospatial database to store all transmitted data. ... Raven Industries Flow Meter. Hand spray gun with custom wiring harness and waist battery pack.
Humor In conclusion, the subject matter that people choose to joke about goes a long ways in determining whether they are using humor as release of moderate levels of discomfort, or whether they are really engaging in "hate-speech" disguised as...
Blockchains are distributed ledgers. When ledger technologies change, so do economies. Blockchain is not just a new general purpose technology. Blockchain is a new institutional mechanism to organise economic activity and behaviour
FGM - practiced in UK and US until 1950s as 'cure' for masturbation, lesbianism and promiscuity (Koso-Thomas, O. (1987). ... Change in the manner of dress: for example, clothes which do not suit the climate which may be used to...