Recovery-Oriented Computing (ROC) Project Summer 2003 Retreat
RADS Breakouts—Developing a New Research Agenda
- Integrating Service/Server/Network Monitoring, Measurement &
- Reliability Benchmarking for Networks, Servers, and Services
- Managing Denial of Service and Service Failures in Systems
- Deploying P2P and Overlay Networks
- Minimizing the Effect of Operator Errors and Misconfigurations in System
- Verifying and Learning Correct Service and Protocol Behaviors
- Network Storage Systems for the Future - Atsushi Ishikawa,
Ryusuke Ito - Hitachi Ltd.
In recent years, network technology and storage technology have become closely intertwined, which has produced many
useful solutions for computer systems. In the first half of this poster (Part I), we clarify users' system
requirements and priorities, as well as trends in network storage technology (e.g. NAS,
iSCSI, overlay networks, IP-VPN, DWDM). We then present a near-future vision of existing storage system
architectures, categorized by scale (e.g. LAN/DAS scale, Intranet/SAN scale, WAN/Wide-area
SAN scale). In the second half of the poster (Part II), we focus on "distributed storage
systems" and present a vision of their possibilities, including assumed topologies and applications, weighing
the merits and demerits of recent research.
- Improving Service Availability Measurements - Steve Zhang
We currently base web service availability on the success or failure of individual HTTP requests, where each request has equal weighting in
determining the availability of the service. However, this technique is subject to several pitfalls. Most often, the measured availability is
inflated by numerous requests for images embedded in a webpage. This poster explores several ways to mitigate this effect. This
work-in-progress poster is intended to generate discussion and feedback as we look for
a good way to measure the availability of the future ROC-2 platform.
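The inflation effect described above can be illustrated with a small sketch. The abstract does not say which mitigations the poster explores, so the grouping of requests into "page views" below is only one hypothetical approach, and the `Request` fields and sample log are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Request(NamedTuple):
    page_view: str  # hypothetical ID grouping a page with its embedded objects
    url: str
    ok: bool        # True if the HTTP request succeeded


def per_request_availability(requests):
    """Every request counts equally, as in the current measurement."""
    return sum(r.ok for r in requests) / len(requests)


def per_page_view_availability(requests):
    """Assumed alternative: a page view succeeds only if all its requests do."""
    views = defaultdict(list)
    for r in requests:
        views[r.page_view].append(r.ok)
    return sum(all(oks) for oks in views.values()) / len(views)


# Illustrative log: one successful page with three images, one failed page.
log = [
    Request("v1", "/index.html", True),
    Request("v1", "/a.png", True),
    Request("v1", "/b.png", True),
    Request("v1", "/c.png", True),
    Request("v2", "/cart", False),
]

print(per_request_availability(log))    # image requests dominate the count
print(per_page_view_availability(log))  # each user-visible page counts once
```

In this made-up log the per-request figure is higher than the per-page-view figure, because the three image requests all count as separate successes.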
- Latency as a Performability Metric for Internet Services - Pete
Compared with throughput or availability, response time offers a better view of the end-user experience that an interactive service provides during failures.
This ongoing study considers the best ways to record, summarize and examine latency-based measurements in order to improve the reliability of online services.
Of particular interest is how latency measurements interact with other aspects
of the user experience, such as data quality.
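As a rough illustration of what summarizing latency measurements can involve, the sketch below compares a mean with nearest-rank percentiles and the fraction of slow requests. The choice of statistics, the threshold value, and the sample data are assumptions for illustration, not the method used in the study.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms, threshold_ms):
    """Summarize response times; threshold_ms is a hypothetical 'too slow' cutoff."""
    n = len(latencies_ms)
    return {
        "mean": sum(latencies_ms) / n,
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "over_threshold": sum(l > threshold_ms for l in latencies_ms) / n,
    }


# Illustrative data: nine fast responses and one very slow one during a failure.
latencies = [100, 110, 120, 130, 140, 150, 160, 170, 180, 8000]
summary = summarize(latencies, threshold_ms=1000)
print(summary)
```

With data like this, a single slow response pulls the mean far above the median, which is one reason a summary might report tail percentiles alongside averages.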
- Automating Data Dependability -
Kim Keeton, John Wilkes, HP Labs
Constructing dependable storage systems is difficult because there are many
techniques to choose from, and they often interact in unforeseen ways. The resulting
storage systems are often over-provisioned, provide inadequate protection, or both. We assert that automating our way out of this dilemma is
both desirable and achievable, and we present some lessons we have learned from
our initial efforts at doing so. The result is a first step down the path of self-managing, dependability-aware storage systems, including a better
understanding of the problem space and its tradeoffs, and a number of insights
that we believe will be helpful to others.
Last Updated: 02/12/2004 09:21