C3: Cutting Tail Latency in Cloud Data Stores via Adaptive Replica Selection

Abstract

Achieving predictable performance is critical for many distributed applications, yet it is difficult because many factors skew the tail of the latency distribution even in well-provisioned systems. In this paper, we present the fundamental challenges involved in designing a replica selection scheme that is robust in the face of performance fluctuations across servers. We illustrate these challenges through performance evaluations of the Cassandra distributed database on Amazon EC2. We then present the design and implementation of an adaptive replica selection mechanism, C3, that is robust to performance variability in the environment. We demonstrate C3’s effectiveness in reducing the latency tail and improving throughput through extensive evaluations on Amazon EC2 and through simulations. Our results show that C3 significantly improves latencies at the mean, median, and tail (up to 3 times improvement at the 99.9th percentile) and provides higher system throughput.

Authors
  • Suresh, Lalith
  • Canini, Marco
  • Schmid, Stefan
  • Feldmann, Anja
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
12th USENIX Symposium on Networked Systems Design and Implementation (NSDI)
Divisions
Communication Technologies
Subjects
Computer Science (General)
Event Location
Oakland, California, USA
Event Type
Conference
Event Dates
May 2015
Date
2015