Exploiting Route Redundancy via Structured Peer to Peer Overlays
Infrastructure-based Resilient Routing Ben Y. Zhao, Ling Huang, Jeremy Stribling, Anthony Joseph and John Kubiatowicz University of California, Berkeley Sahara Winter Retreat, 2004 Challenges Facing Network Applications Network connectivity is not reliable Disconnections frequent in the wide-area Internet IP-level repair is slow Wide-area: BGP 3 mins Local-area: IS-IS 5 seconds Next generation network applications Mostly wide-area Streaming media, VoIP, B2B transactions Low tolerance of delay, jitter and faults Our work: transparent resilient routing infrastructure that adapts to faults in not seconds, but milliseconds February 21, 2020
Sahara Winter Retreat, Jan 2004 [email protected] Talk Overview Motivation A Structured Overlay Infrastructure Mechanisms and policy Evaluation Summary February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] The Challenge Routing failures are diverse Many causes: Occur anywhere with local or global impact:
Router misconfigurations, cut fiber, planned downtime, protocol implementation bugs Single fiber cut can disconnect AS pairs Isolating failures is difficult Wide-area measurement is ongoing research Single event leads to complex inter-protocol interactions End user symptoms often dynamic or intermittent Requires: Fault detection from multiple distributed vantage points In-network decision making necessary for timely responses February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] An Infrastructure Approach Our goals
Overlay focused on resiliency Route around failures to maintain connectivity Respond in milliseconds (react instantaneously to faults) Our approach Large-scale infrastructure for fault and route discovery Nodes are observation points (similar to Platos NEWS service) Nodes are also points of traffic redirection (forwarding path determination and data forwarding) Automated fault-detection and circumvention No edge node involvement: fast response time, security focused on infrastructure Fully transparent, no application awareness necessary February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] An Illustration Goal: fast fault detection and route-around Qwest Backbone
Key: on the fly in-network traffic redirection February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Why Structured Overlays Resilient Overlay Networks (MIT) Fully connected mesh Allows each node full knowledge of network D Fast, independent calculation of routes Nodes can construct any path, maximum flexibility Cost of flexibility
Protocol needs to choose the right route/nodes Per node O(n) state Monitors n - 1 paths O(n2) total path monitoring is expensive February 21, 2020 Sahara Winter Retreat, Jan 2004 S [email protected] The Big Picture v v v v v v OVERLAY v v
v v v v v Internet Locate nearby overlay proxy Establish overlay path to destination host Overlay traffic routes traffic resiliently February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Traffic Tunneling Legacy Node A A, B are IP addresses P(B) B register register Proxy
Legacy Node B Proxy P(A) = A get (hash(B)) P(B) put (hash(A), P(A)) P(B) = B put (hash(B), P(B)) Structured Peer to Peer Overlay Store mapping from end host IP to its proxys overlay ID Similar to approach in Internet Indirection Infrastructure (I3) February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Tradeoffs of Tunneling via P2P Less neighbor paths to monitor per node: O(log(n))
Large reduction in probing bandwidth: O(n) O(log(n)) Faster fault detection with low bandwidth consumption Actively maintain path redundancy Manageable for small # of paths Redirect traffic immediately when a failure is detected Eliminate on-the-fly calculation of new routes Restore redundancy when a path fails Fast fault detection + precomputed paths = increased responsiveness Cons: overlay imposes routing stretch (mostly < 2) February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] In-network Resiliency Mechanisms Efficient fault detection
Use soft-state to periodically probe log(n) neighbor paths Small number of routes reduced bandwidth Exponentially weighted moving average for link quality estimation Simple approach taken, ongoing research available Avoid route flapping due to short term loss artifacts Loss rate Ln = (1 - ) Ln-1 + p Smart fault-detection / propagation (Zhuang04) Intelligent and cooperative path selection (Seshardri04) Maintaining backup paths Each hop has flexible routing constraint Create and store backup routes at node insertion Restore redundancy via intelligent gossip after failures Simple policies to choose among redundant paths February 21, 2020
Sahara Winter Retreat, Jan 2004 [email protected] First Reachable Link Selection (FRLS) Use estimated loss results to choose shortest usable path Sort next hop paths by latency Use shortest path with minimal quality > T Correlated failures Reduce with intelligent topology construction Key is to leverage redundancy available February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Evaluation
Metrics for evaluation How much routing resiliency can we exploit? How fast can we adapt to faults (responsiveness)? Experimental platforms Event-based simulations on transit stub topologies Data collected over multiple 5000-node topologies PlanetLab measurements Microbenchmarks on responsiveness More details in paper (ICNP03) and poster session February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] % of All Pairs Reachable Exploiting Route Redundancy 1 (Sim)
0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 Instantaneous IP 0 0.05 Tapestry / FRLS 0.1 0.15 0.2 Proportion of IP Links Broken Simulation of Tapestry, 2 backup paths per routing entry Transit-stub topology shown, results from TIER and AS graphs similar February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected]
Responsiveness to Faults (PlanetLab) Time to Switch Routes (ms) 2500 2000 1500 1000 660 alpha=0.2 alpha=0.4 500 0 0 200 300 400 600 800 1000 1200
Link Probe Period (ms) Response time increases linearly with probe period Minimum link quality threshold T = 70%, 20 runs per data point February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Link Probing Bandwidth (Planetlab) Bandwidth Per Node (KB/s) 7 PR=300ms PR=600ms 6 5 4 3 2 1 0 1 10 100 1000
Size of Overlay Medium sized routing overlays incur low probing bandwidth Bandwidth increases logarithmically with overlay size February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Conclusion Pros and cons of infrastructure approach Structured routing has low path maintenance costs Transparent to user applications Can no longer construct arbitrary paths Structured routing with low redundancy close to ideal connectivity Incur low routing stretch Fast enough for highly interactive applications
Allows caching of backup paths for quick failover 300ms beacon period response time < 700ms On overlay networks of 300 nodes, b/w cost is 7KB/s Ongoing questions Is there lower bound on desired responsiveness? Should we use multipath redundant routing for resilience? How to deploy as a single network across ISPs? VPN-like routing service? February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected] Related Work Redirection overlays
Topology estimation techniques Detour (IEEE Micro 99) Resilient Overlay Networks (SOSP 01) Internet Indirection Infrastructure (SIGCOMM 02) Secure Overlay Services (SIGCOMM 02) Adaptive probing (IPTPS 03) Internet tomography (IMC 03) Routing underlay (SIGCOMM 03) Many, many other structured peer-to-peer overlays Thanks to Dennis Geels / Sean Rhea for their work on BMark February 21, 2020 Sahara Winter Retreat, Jan 2004 [email protected]
Traditional channel system: members make little or no effort to cooperate with each other. Channel specialization can lead to conflict that gets in the way. Vertical conflict. is conflict between two firms at different levels in the channel of distribution,...
Designing & Delivering Whole-Person Transitional CareThe Hospital Guide to Reducing Medicaid Readmissions. Webinar 5:Reach Out to Collaborate with Partners Across Settings ... Cast aside the assumption that "there are no resources in our community"
Multicore New languages aren't necessary Legacy code easily adjusted Manycore Implicitly Parallel Sequential Programming No optimization for sequential (custom allocators) Points of non-determinism specified Parallel algorithms in sequential codes Debuggability, Understandability, Sanity The Answer Originates with ATE The Old Way:...
Zeus is the king of the gods, the ruler of Mount Olympus and the god of the sky and thunder. His symbols are the thunderbolt, eagle, bull, and oak. His siblings: Posiedon, Hades, Hestia, Demeter, and Hera Had many affairs...
Reuters Use words to Form Sentences maganda, anak, ate rhoda, kuya Jonathan malaki, bahay, may, nanay, tatay, ko matatalino, mga estudyante, klase mabait, kasintahan, niya malakas, Ben, kahanga-hanga, Banawe Rice Terraces, Pilipinas may, pula, damit, Mila, bahay Review Notes for...
Specifically, $205 million in goods and services (45% of all purchases), were not required to be competitively bid because their value was below the cost threshold of $25,000." 10 10 Agencies may have used informal competitive processes, such as soliciting...
Individual Psychotherapy Introduction Types of Therapy Wohlberg (1967) Supportive. ... the student comes either to the desired knowledge by answering the questions or to a deeper awareness of the limits of knowledge. Common Coping Skills Developed in CBT Self-monitoring. Relaxation...
Ready to download the document? Go ahead and hit continue!