Internet Systems and Storage Group Software architectures for scalable wide-area systems Duke Computer Science
	Home \| Members \| Publications \| Internal

TACT
(Tunable Availability and Consistency Tradeoffs)

An ultimate goal for modern Internet services is the development of scalable, high-performance, highly-available and fault-tolerant systems. Replication (caching as a special case) is a key approach to achieve this goal. However, because of wide-area latency, potential Internet congestion/failures and the inherent overhead associated with strong consistency, a naive replication system may actually decrease rather than increase performance and availability. Given such duality, the practical use of replication in the Internet has been much hindered by the lack of appropriate models and realistic experiences. We believe that WAN applications can benefit much more from replication than using today's replication techniques. In particular, we argue that Internet services can benefit from dynamically choosing consistency/performance and consistency/availability tradeoffs, based on client, service and network characteristics. Within this context, our TACT project has made the following contributions:

Conit-based Continuous Consistency Model

In today's world, system consistency is typically defined as a switch. Either the system provides strong consistency with reduced performance/availability, or the system provides weaker consistency (e.g., through optimistic replication) with a higher level of system performance/availability.

However, for many WAN applications, consistency is continuous rather than binary. For example, for better performance and availability, an airline reservation system may be willing to tolerate 1% reservation conflicting rate. In general, specifying higher consistency will result in lower performance/availability. One desirable feature of TACT is the ability to dynamically trade consistency for availability (and performance) in response to current system, network, and client characteristics. For example, requests from preferred clients may be guaranteed higher consistency. As another example, an airline reservation system may gradually increase consistency as the number of available seats on a flight decreases.

The first step in TACT is to conceptually turn the binary consistency switch into a knob. However, consistency semantics are highly application-dependent. Thus, in the consistency model, we must allow the application to export its specific semantics and quantify its own consistency. The concept of conit serves as a carrier for such application consistency semantics and is key to our continuous consistency model. Conceptually, a conit is a logical consistency unit. Different from most consistency models, applications impose consistency requirements on conits rather than physical data items in our model. The consistency level of each conit is define using a simple, spanning-set of metrics: Numerical Error, Order Error and Staleness. Our paper in ICDCS'01 concentrates on our abstract conit-based continuous consistency model and its generality and practicality. The journal version (TOCS'02) of the paper discusses the model more deeply.

Design, Implementation and Performance Evaluation of the TACT Toolkit

To demonstrate the feasibility of our model and evaluate its performance, we implemented the TACT toolkit for WAN replication and three sample applications (Airline Reservation, Bulletin Board and QoS Load Distribution) using the toolkit. First, We designed a family of protocols to enforce numerical error bound (see our VLDB2000 paper). Order error and staleness are bound using modified versions of existing protocols. We deployed the prototype in Berkeley, Utah and Duke and evaluated its performance. The system architecture of the prototype and performance results are available in our OSDI2000 paper.

Availability Upper Bounds vs. Achieved Availability under Continuous Consistency

To fully understand the effects of continuous consistency on system availability, we deploy our prototype on eight WAN sites and measure the availability of the bulletin board application through a multi-day period. Using emulation, we are able to inject faultloads into the system and measure the achieved availability of various consistency protocols under various replication and failure conditions. We also theoretically derive tight upper bound on system availability as a function of consistency, workload and faultload. Such upper bounds allow us to argue about the absolute merits of various consistency protocols. These results are published in SOSP'01.

Optimal Replica Placement For Availability

We have extended our availability upper bound theory so that it can compute the optimal replica placement for best system availability given a particular consistency level. The new theory also allows replicas to be disabled to allow progress in other partitions. The complete theory is available in PODC'02.

More detail is available from the following:

Haifeng Yu, " Roadmap of TACT publications ." Read this first if you do not know which TACT paper you are looking for.

Haifeng Yu and Amin Vahdat, "Minimal Replication Cost for Availability." Proceedings of the 21st ACM Symposium on Principles of Distributed Computing (PODC), July 2002. [PDF]. A longer version available as Technical Report CS-2002-04. [PDF].

Haifeng Yu and Amin Vahdat, "Design and Evaluation of a Conit-based Continuous Consistency Model for Replicated Services. " To appear in ACM Transactions on Computer Systems (TOCS). [Draft Version].

Haifeng Yu and Amin Vahdat, "The Costs and Limits of Availability for Replicated Services." To appear in the Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), October 2001. [PDF]

Haifeng Yu and Amin Vahdat, "Combining Generality and Practicality in a Conit-Based Continuous Consistency Model for Wide-Area Replication." Proceedings of the 21st International Conference on Distributed Computing Systems (ICDCS), April 2001. [Postscript], [PDF]. A longer version available as Technical Report CS-2000-09. [Postscript], [PDF].

Haifeng Yu and Amin Vahdat, "Using TACT to Build Replicated Network Services." June 2000. [Presentation Slides]

Haifeng Yu and Amin Vahdat, "Design and Evaluation of a Continuous Consistency Model for Replicated Services." Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI), October 2000. [Postscript], [PDF].

Haifeng Yu and Amin Vahdat, "Efficient Numerical Error Bounding for Replicated Network Services." 26th International Conference on Very Large Databases (VLDB), September 2000. [Postscript], [PDF]. A longer version available as Technical Report CS-2000-08. [Postscript], [PDF].

Haifeng Yu and Amin Vahdat, "Building Replicated Internet Services Using TACT: A Toolkit for Tunable Availability and Consistency Tradeoffs." Second International Workshop on Advanced Issues of E-Commerce and Web-based Information Systems, June 2000.

Haifeng Yu and Amin Vahdat, "TACT: Tunable Availability and Consistency Tradeoffs for Replicated Internet Services." SOSP Work in Progress Presentation, December 1999. [PDF]

Updated June, 2003.

TACT (Tunable Availability and Consistency Tradeoffs)

Conit-based Continuous Consistency Model

Design, Implementation and Performance Evaluation of the TACT Toolkit

Availability Upper Bounds vs. Achieved Availability under Continuous Consistency

Optimal Replica Placement For Availability

TACT
(Tunable Availability and Consistency Tradeoffs)