Supporting Tail Latency SLOs in Ceph

Open Access
- Author:
- Guan, Shaobo
- Area of Honors:
- Computer Science
- Degree:
- Bachelor of Science
- Document Type:
- Thesis
- Thesis Supervisors:
- Timothy Zhu, Thesis Supervisor
- Ting He, Thesis Honors Advisor
- Keywords:
- Tail latency SLOs
- Distributed storage
- Ceph
- Scheduling
- Abstract:
- When optimizing the performance of a storage system, people often focus on reducing the average latency. When a workload is bursty, however, a traditional server that optimizes for average latency can still produce a high tail latency (i.e., the time the server takes to serve the slowest requests). To ensure consistent performance across almost all users, companies often want tail latency to be low and bounded. The objective of this thesis is to design a new system for meeting tail latency performance goals, known as Service Level Objectives (SLOs), in the context of distributed storage. Prior work has demonstrated that analytical modeling approaches can help configure systems to meet tail latency SLOs in networks, hard drives, and SSDs, but these approaches have not yet been applied to distributed storage. They are hard to apply there because each client communicates with many servers, which leads to unpredictable interference patterns. This thesis takes a first step toward analytically modeling distributed storage for meeting tail latency SLOs. To evaluate our distributed storage model, we use Ceph, an open-source distributed storage system, as the proving ground. Our preliminary results show that our model can integrate with prior work to meet tail latency SLOs in distributed storage while using the storage resources efficiently.
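To make the gap between average and tail latency concrete, the following is a minimal sketch rather than anything taken from the thesis. It assumes a nearest-rank percentile definition, a hypothetical bursty workload (980 requests at 2 ms and 20 requests at 200 ms), and a hypothetical SLO of "99th percentile at most 50 ms". The names `percentile` and `meets_slo` are illustrative only.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_slo(samples, p, target_ms):
    """Return True if the p-th percentile latency is within target_ms."""
    return percentile(samples, p) <= target_ms


# Hypothetical workload (an assumption for illustration, not measured data):
# most requests are fast, but a burst delays 2% of them.
latencies_ms = [2.0] * 980 + [200.0] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)

print(f"mean = {mean_ms:.2f} ms, p99 = {p99_ms:.1f} ms")
print("meets p99 <= 50 ms SLO:", meets_slo(latencies_ms, 99, 50.0))
```

In this made-up workload the mean stays under 6 ms while the 99th percentile is 200 ms, so a system judged by its average would look healthy while violating the tail latency SLO.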