Brocade: Landmark
Routing on Overlay Networks - Ling Huang
Peer-to-peer networks offer attractive features such
as decentralized computing, fault tolerance, and the sharing and
aggregation of resources. Because of the abstract, theoretical approach
taken in these systems, they generally assume that nodes in the system
are uniform in resources. As a result, messages are routed
on the overlay with minimal consideration of the actual network topology
and of differences in node resources. Many of the designs and
algorithms used by existing peer-to-peer infrastructures follow
this tenet, and such systems often show sub-optimal performance.
We believe that such a system can benefit from some form of internal
organization and differentiation among its nodes, thus
enjoying the efficiency of a well-organized system while still
providing users with the features of P2P. We propose Brocade,
a secondary overlay layered on top of existing P2P
systems that exploits knowledge of underlying network characteristics.
The secondary overlay builds a location layer between "supernodes,"
nodes that are situated near network access points, such as gateways
to administrative domains. By associating local nodes
with their nearby "supernode," messages across the wide area can
take advantage of the highly connected network infrastructure between
these supernodes to shortcut across distant network domains, greatly
improving point-to-point routing distance and reducing overall
network bandwidth usage. We present the initial architecture
of a Brocade secondary overlay on top of a Tapestry network, and
demonstrate its potential performance benefits by simulation.
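As a rough illustration of the shortcut routing described above, the following sketch (our own simplification in Python; the class and method names are hypothetical, not the actual Brocade implementation) shows a message being handed to the sender's nearby supernode, shortcut across the inter-supernode overlay, and delivered inside the destination's domain.

    # Hypothetical sketch of Brocade-style supernode routing (not the real API).

    class Node:
        def __init__(self, name, domain):
            self.name, self.domain = name, domain

    class Supernode:
        def __init__(self, domain):
            self.domain = domain          # administrative domain this supernode covers
            self.local_nodes = set()      # nearby nodes registered with this supernode
            self.peers = {}               # domain -> Supernode, the secondary overlay

        def route(self, msg, dest):
            if dest in self.local_nodes:
                return "deliver %s to %s inside %s" % (msg, dest.name, self.domain)
            # Shortcut across the wide area: hand the message to the
            # supernode responsible for the destination's domain.
            remote = self.peers[dest.domain]
            return remote.route(msg, dest)

    # A local node never routes wide-area traffic itself; it forwards to its
    # nearby supernode, which shortcuts over the well-connected core.
    def send(src_supernode, msg, dest):
        return src_supernode.route(msg, dest)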
Exploiting Routing Redundancy Using a Wide-area Overlay -
Ben Zhao
As new and interesting peer-to-peer applications
combine with advancements in networking technology, they are
reaching millions of users across the globe. Numerous studies have
shown, however, that loss of connectivity is common on the wide-area
network, due to hardware and software failures, and network
misconfigurations.
Despite the natural redundancy present in underlying network
links, the current IP layer fails to recognize and recover
from these frequent failures in a timely fashion. This work presents
fault-tolerant routing on the Tapestry overlay network, which
exploits existing network redundancy by dynamically switching
traffic onto precomputed alternate routes. Furthermore, messages
in our system can be duplicated and multicast "around"
network congestion and failure hotspots with rapid reconvergence
to drop duplicates. Our simulations show fault-tolerant Tapestry
to be highly effective at circumventing link and node failures,
with reasonable cost in terms of additional routing latency and
additional bandwidth.
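The switch to precomputed alternate routes can be sketched as follows (our own illustration; the route-selection details are assumptions rather than the actual Tapestry code): each routing entry keeps backup next hops, and traffic moves to the first live candidate when the primary is believed down.

    # Hypothetical sketch: each routing-table entry keeps precomputed backups,
    # so a failed primary hop can be bypassed without recomputing routes.

    routing_table = {
        # destination prefix -> list of next hops, primary first, backups after
        "4xx": ["nodeA", "nodeB", "nodeC"],
    }

    failed = {"nodeA"}   # hops currently believed down (e.g., missed heartbeats)

    def next_hop(prefix):
        for hop in routing_table[prefix]:
            if hop not in failed:
                return hop               # first live candidate wins
        raise RuntimeError("all precomputed routes for %s are down" % prefix)

    print(next_hop("4xx"))               # -> "nodeB": traffic switches to a backup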
Failure of the PSTN: 2000 - Patricia Enriquez
FIG: Fault Injection
in glibc -- A Tool for Online Verification of Recovery Mechanisms
- Pete Broadwell
Enhanced software tools are necessary to evaluate the
reliability and recoverability of applications under operating
environment failures. FIG is a lightweight, extensible software
testing package that intercepts calls from applications to the
operating system and injects errors to simulate system faults.
We evaluate the effectiveness of FIG and explore classes of recovery
methods by using FIG to inject faults into several common
UNIX applications.
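FIG itself interposes on calls from the application into glibc; as a language-neutral analogue of the same idea (written here purely for illustration, not part of FIG), the sketch below wraps an OS call and injects errors with a configurable probability.

    import errno, os, random

    # Illustrative analogue of FIG-style interposition: wrap a call the
    # application makes into the OS and occasionally return a simulated fault.
    def inject_faults(func, fault_rate=0.1, exc=OSError, err=errno.ENOSPC):
        def wrapper(*args, **kwargs):
            if random.random() < fault_rate:
                raise exc(err, os.strerror(err))   # simulated "disk full"
            return func(*args, **kwargs)
        return wrapper

    faulty_write = inject_faults(os.write, fault_rate=0.2)

    # An application under test would call faulty_write instead of os.write,
    # letting us observe whether its recovery code handles ENOSPC gracefully.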
Hash-History Approach
for Reconciling Mutual Inconsistency in Optimistic Replication
- B. Hoon Kang
Most optimistic replication systems are based on a variant
of the version-vector mechanism, which maintains state for each
replica site. The need to maintain state per site fundamentally
limits scalability in the number of replica sites and complicates
the management of replica-site creation and retirement. We present
a "hash history" based approach in which no per-site state needs
to be maintained in order to detect and reconcile conflicts. Our
simulation results show that the hash history approach is correct,
scales as the number of sites increases, and has other interesting
properties. Hash history is capable of capturing the case in which
two different series of non-commutative operations independently
produce the same output. Our simulations verified that this
property helps to reduce the conflict rate during anti-entropy
reconciliations.
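A rough sketch of the idea, based only on this abstract (the paper's actual data structure records more, e.g., parent links between versions): each replica keeps the hashes of the versions it has passed through, and two replicas compare current hashes and histories to decide between fast-forward and true conflict. Because equality is decided by content hash, two sites that independently produce the same output map to the same entry.

    import hashlib

    def h(state: bytes) -> str:
        return hashlib.sha1(state).hexdigest()

    class Replica:
        """Sketch of a hash-history replica: no per-site version vector,
        just the hashes of all versions this replica has passed through."""
        def __init__(self, state=b""):
            self.state = state
            self.history = {h(state)}

        def update(self, new_state: bytes):
            self.state = new_state
            self.history.add(h(new_state))

        def reconcile(self, other):
            a, b = h(self.state), h(other.state)
            if a == b:
                return "identical (even if produced by different operations)"
            if a in other.history:
                return "other dominates: fast-forward self"
            if b in self.history:
                return "self dominates: fast-forward other"
            return "true conflict: needs merge"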
Mailbench - An e-mail dependability benchmark - Leonard Chung
E-mail was originally conceived as a "best effort"
service, with little attention directed towards attaining 100%
dependability. Despite these humble beginnings, e-mail has now become
a mission-critical service. Spurred on by its increasing ubiquity
and rising expectations for reliability, 100% dependability
is no longer just a desirable feature but rather a virtual requirement
in modern-day systems. Given that e-mail was never designed with
such a goal in mind, we seek to define metrics of dependability
and have created a dependability benchmark to allow comparison of
e-mail dependability between different e-mail platforms. We present
the structure of the benchmark, possible applications and future
directions for the tool.
Performance of the
Archive - Hakim Weatherspoon
From a performance standpoint, is archiving realistic?
There are three options for archiving data and each decision has
tradeoffs. First, no archiving at all (tradeoff: no durability).
Second, archiving offline or delayed (tradeoff: time to durability).
Third, archiving inline (tradeoff: slower response time, but higher
durability guarantees).
Platform Modular Redundancy with Virtual Machines
- James Cutler
Platforms include the hardware, operating system, and
middleware used by an application. Traditional fault-tolerant
approaches have produced reliable hardware. The application coders
are responsible for the application. But who verifies correct
operation of the OS and other middleware? Two interesting examples
include: 1) an application has a bug that, when activated,
results in different failure modes on different platforms (e.g.,
livelock, segmentation fault, or silently incorrect output); 2) the
application is correct but the platforms are faulty. In this
work, we explore the use of virtual machine technology to provide
redundancy and genetic diversity in platform technology. Modular
redundancy and voting are used to detect and mask failures in the
application and platform, as described in cases 1 and 2 above.
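The voting step can be illustrated with a short sketch (our own simplification; setting up the VM-hosted replicas is not shown): the same request runs on several platform replicas, the majority output is accepted, and any replica whose platform or application misbehaved is flagged.

    from collections import Counter

    def vote(outputs):
        """Majority voting over replica outputs; a disagreeing replica is
        flagged as suspect (its platform or application may be faulty)."""
        winner, count = Counter(outputs).most_common(1)[0]
        if count <= len(outputs) // 2:
            raise RuntimeError("no majority: cannot mask the failure")
        suspects = [i for i, out in enumerate(outputs) if out != winner]
        return winner, suspects

    # e.g., three VM replicas, one of which silently returns a wrong answer:
    result, suspects = vote(["42", "42", "17"])
    print(result, suspects)   # -> 42 [2]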
RAINS: Redundant Array of Independent, Non-Durable
Stores - Andy Huang, Armando Fox
Some types of state (e.g., shopping cart state) don't
require absolute consistency and durability, but since
there isn't a better alternative, this state is stored in storage
servers that provide absolute guarantees (e.g., DBMS, DDS). These
guarantees come at
a high cost; they make node failures expensive (operations are
suspended during recovery), make scaling expensive (adding nodes
requires downtime and administration), and limit performance
(coordination incurs overhead). Our hypothesis is that we
can reduce or eliminate these costs by relaxing consistency
and durability. We propose a storage solution that achieves probabilistic
consistency and durability by storing state redundantly in an array
of independent, non-durable state stores. In our initial implementation,
we intend to find out whether usable levels of consistency
and durability can be achieved while reducing the costs associated
with node failures, scaling, and coordination overhead.
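One way the probabilistic guarantees could arise from redundancy (our own illustration, not the planned RAINS interface) is sketched below: each value is written to several independent stores, and a read accepts any surviving copy, so durability is bounded by the chance that every copy is lost at once.

    import random

    class NonDurableStore:
        """An individual store that may lose its contents (e.g., on restart)."""
        def __init__(self):
            self.data = {}
        def put(self, k, v):
            self.data[k] = v
        def get(self, k):
            return self.data.get(k)
        def crash(self):
            self.data.clear()              # no durability on its own

    class RAINSSketch:
        def __init__(self, stores, copies=3):
            self.stores, self.copies = stores, copies
        def put(self, k, v):
            for s in random.sample(self.stores, self.copies):
                s.put(k, v)                # redundant, independent copies
        def get(self, k):
            for s in self.stores:
                v = s.get(k)
                if v is not None:
                    return v               # any surviving copy suffices
            return None                    # all copies lost: durability is probabilistic

    stores = [NonDurableStore() for _ in range(5)]
    rains = RAINSSketch(stores)
    rains.put("cart:42", ["book", "cd"])
    stores[0].crash()                      # losing one store need not lose the cart
    print(rains.get("cart:42"))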
A Recursively Restartable
iRoom - JP Schnapper-Casteras, George Candea, Armando
Fox
A recursively restartable (RR) system is one capable
of gracefully tolerating restarts at multiple levels; RR
was proposed as a recovery-oriented technique for improving availability
in complex software systems. We have successfully applied RR to
a satellite ground station and improved its recovery time fourfold.
We are now applying RR to the Stanford Interactive Workspaces
Room (iRoom), a multi-node, Windows-based service-rich
environment. The iRoom is a complex system, built from COTS software
as well as custom-written applications; it is a mix of closed
and open source programs, written in several different programming
languages.
Supporting Rapid Mobility via Locality in an Overlay
Network - Ben Zhao
In this poster, we present "Mobile Tapestry," an
extension to the Tapestry overlay network protocol that enables
scalable, fault-tolerant, and timely delivery of network messages,
including multimedia streams, to and from rapidly moving nodes.
Mobile Tapestry efficiently supports individual mobile nodes and,
by using an approach we call hierarchical mobility, it
also supports large groups of mobile nodes simultaneously
moving together. Mobile Tapestry leverages Tapestry's
locality mechanisms to reduce the latency and bandwidth of mobility
update traffic, while eliminating the routing inefficiencies and
availability problems of IP-based mobility protocols, such as
Mobile IP. Our simulation results show that Mobile Tapestry significantly
outperforms Mobile IP by providing lower latency and
higher fault tolerance for message delivery.
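The hierarchical mobility idea can be sketched roughly as follows (our own reading of the abstract; names and details are hypothetical and the real protocol differs): members of a moving group register locally with a group node, and only that group node publishes location updates into the wide-area overlay, so one update covers the whole group.

    # Hypothetical sketch of hierarchical mobility: members attach to a group
    # node; the wide-area overlay only tracks the group node's location.

    class Overlay:
        def __init__(self):
            self.location = {}                 # name -> current attachment point
        def publish(self, name, where):
            self.location[name] = where        # one wide-area location update

    class MobileGroup:
        def __init__(self, name, overlay):
            self.name, self.overlay = name, overlay
            self.members = set()
        def join(self, member):
            self.members.add(member)           # local registration only
        def move(self, new_access_point):
            # All members move together: a single update covers the whole group.
            self.overlay.publish(self.name, new_access_point)

    overlay = Overlay()
    train = MobileGroup("train-17", overlay)
    for device in ("laptop-a", "laptop-b", "phone-c"):
        train.join(device)
    train.move("base-station-42")              # one update, not three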
Tapestry: Motivation
and Algorithms - Kris Hildrum
This poster will have two parts. In the first part,
we will motivate Tapestry by exploring the problem of finding
objects in a distributed network. We will explain why the simple
solutions are not good enough, and compare Tapestry to other, more sophisticated
systems such as Chord, CAN, and Pastry. The second part will focus on the algorithms needed
to maintain Tapestry, including node insertion and deletion,
with a special focus on what needs to be done to handle simultaneous
insertions. These results will appear in SPAA 2002.
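Tapestry routes by resolving the destination identifier one digit per hop (Plaxton-style); the toy sketch below (our own simplification, shown with prefix matching for readability, and not the SPAA 2002 insertion algorithms) illustrates that digit-by-digit step.

    # Toy sketch of Plaxton/Tapestry-style digit-by-digit routing.
    # Each hop must agree with the destination ID in at least one more digit.

    DIGITS = 6                                # e.g., 6-digit identifiers

    def shared_prefix(a, b):
        n = 0
        while n < DIGITS and a[n] == b[n]:
            n += 1
        return n

    def route(nodes, src, dest):
        """Greedy routing: every hop extends the match with dest by one digit."""
        path, cur = [src], src
        while cur != dest:
            level = shared_prefix(cur, dest)
            # Pick any known node that extends the match
            # (real Tapestry keeps a per-level neighbor table with backups).
            nxt = next(n for n in nodes if shared_prefix(n, dest) > level)
            path.append(nxt)
            cur = nxt
        return path

    nodes = ["231400", "233212", "233101", "233100"]
    print(route(nodes, "231400", "233100"))
    # -> ['231400', '233212', '233101', '233100']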
Towards Building
the OceanStore Web Cache - Patrick Eaton
We have previously presented a cooperative web
caching architecture built on top of OceanStore. We present refinements
to the original design based on experiences building and measuring
the system. We describe further challenges for the system and
ideas for overcoming these issues.
Towards a Theory
for Undo - Aaron Brown
A major tenet of the ROC vision is that service
systems should provide a system-wide Undo mechanism. Undo promotes
a more forgiving environment for the system operator, allowing
effective trial-and-error learning and harnessing the unique human
capacity of hindsight; Undo can be used to recover from human operator
errors, unexpected side-effects of upgrades and reconfigurations,
and even external attacks. Implementing an effective
system-level Undo is rather challenging, requiring extensive mental
gymnastics on the part of the system designer, who has to anticipate
all of the possible "paradoxes" that can arise when system state
is rewound backward in time, altered, then replayed forward again.
To address this complexity, and to provide a framework
for describing and analyzing the behavior of Undoable systems,
we have begun to develop a theory and model for Undo. Our theory
defines a basic architecture for Undoable systems, establishes
a model for tracked operations and undo history, and relates the
existence of external inconsistencies to properties of those
operations and histories. It thus provides a framework
for reasoning about Undo-induced paradoxes, establishes criteria
for the correctness and completeness of an Undo implementation,
and provides guidance in implementing a generalized Undo layer.
Finally, our theory establishes a set of requirements that must
be met if a service system is to be extended with Undo functionality.
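A toy model of the rewind/alter/replay cycle (our own sketch to make the history and paradox language concrete; it is not the model in the theory itself): operations are tracked in a history, rewind restores a snapshot, and replay of an altered history reports a "paradox" wherever an already-externalized output no longer matches.

    class UndoableSystem:
        """Toy sketch of an Undo history: rewind, alter, replay."""
        def __init__(self, state=0):
            self.state = state
            self.snapshot = state
            self.history = []              # tracked operations since the snapshot
            self.externalized = []         # outputs the outside world has already seen

        def do(self, op):
            self.history.append(op)
            self.state = op(self.state)
            self.externalized.append(self.state)   # e.g., a reply sent to a user
            return self.state

        def rewind_and_replay(self, new_history):
            """Rewind to the snapshot, replay an altered history, and report
            'paradoxes': externally visible results that no longer match."""
            state, outputs = self.snapshot, []
            for op in new_history:
                state = op(state)
                outputs.append(state)
            paradoxes = [i for i, (old, new) in
                         enumerate(zip(self.externalized, outputs)) if old != new]
            self.state, self.history, self.externalized = state, new_history, outputs
            return state, paradoxes

    system = UndoableSystem()
    system.do(lambda s: s + 5)
    system.do(lambda s: s * 2)             # suppose this was the operator's mistake
    state, paradoxes = system.rewind_and_replay([lambda s: s + 5, lambda s: s * 3])
    print(state, paradoxes)                # -> 15 [1]: the user already saw 10 at step 1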
Using Introspection to Improve Internet Services - Steve Czerwinski
This poster describes how Internet services can use
introspection to improve their adaptability to changes in
client and network resources. The key idea is to monitor and measure
client actions in order to build models that can be used to predict
costs of future actions. Using this, Internet services can automatically
choose which optimizations should be enabled to improve performance.
These optimizations are built using code and data migration
techniques. This is ongoing research, and the results shown come
from my recent qualifying examination.
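The monitor/model/predict loop can be made concrete with a small sketch (entirely our own illustration; the service's real models and optimizations are not described here): record the observed cost of each client action under each optimization, estimate costs from the samples, and enable whichever optimization the model predicts to be cheapest.

    from collections import defaultdict

    class IntrospectionSketch:
        """Toy sketch: measure client actions, predict costs, pick optimizations."""
        def __init__(self):
            # (action, optimization) -> observed costs (e.g., latency in ms)
            self.samples = defaultdict(list)

        def observe(self, action, optimization, cost):
            self.samples[(action, optimization)].append(cost)

        def predicted_cost(self, action, optimization):
            xs = self.samples[(action, optimization)]
            return sum(xs) / len(xs) if xs else float("inf")

        def choose(self, action, optimizations):
            # Enable whichever optimization the model predicts to be cheapest.
            return min(optimizations, key=lambda o: self.predicted_cost(action, o))

    model = IntrospectionSketch()
    model.observe("render_page", "none", 120)
    model.observe("render_page", "migrate_code_to_client", 45)
    print(model.choose("render_page", ["none", "migrate_code_to_client"]))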
Why do Internet
services fail, and what can be done about it? - David Oppenheimer
We studied the causes of failure in three large-scale
Internet services. We found that operator error is the
largest single contributor to end-user-visible service
failures. These errors are the most difficult type of failure
to mask, and are generally due to configuration problems. We also
found that front-end software can be a significant cause of service
failures and that back-end failures, while infrequent, take longer
to repair than do front-end failures. Finally, we found
that for at least one of the services, online correctness testing
and better exposing and detecting of component failures (hardware,
software, and human) would have mitigated many of the service failures
that were observed.