
Recovery-Oriented Computing

Overview

The Recovery-Oriented Computing (ROC) project is a joint Berkeley/Stanford research project investigating novel techniques for building highly dependable Internet services. In a significant divergence from traditional fault-tolerance approaches, ROC emphasizes recovery from failures rather than failure avoidance. This philosophy is motivated by the observation that even the most robust systems occasionally encounter failures due to human operator error, transient or permanent hardware failure, and software anomalies resulting from "Heisenbugs" or software aging.

The ROC approach takes the following three assumptions as its basic tenets:

These assumptions, while running counter to most existing work in dependable and fault-tolerant systems, are all strongly supported by field evidence from modern production Internet service environments.

ROC Research Areas

The assumptions listed above provide a broad philosophy for guiding the design of ROC systems. From this philosophy, we have identified several more concrete research areas that fall under the ROC umbrella. Each area defines one of the important qualities that a truly recovery-oriented computing system must provide.


Contact: roc-group at cs.berkeley.edu. Last modified on 03-Nov-2004 21:54:22 -0800