Juan F. Perez
9 records found
Authored
Chisel
Reshaping Queries to Trim Latency in Key-Value Stores
It is challenging for key-value data stores to trim the user (tail) latency of requests, as workloads are observed to have a skewed number of key-value pairs, which are commonly retrieved via a multiget operation, i.e., all keys at the same time. In this paper we present Chisel, a novel clie
...
Today’s big data clusters based on the MapReduce paradigm are capable of executing analysis jobs with multiple priorities, providing differential latency guarantees. Traces from production systems show that the latency advantage of high-priority jobs comes at the cost of severe l
...
Holistic Workload Scaling
A New Approach to Compute Acceleration in the Cloud
Workload scaling is an approach to accelerating computation and thus improving response times by replicating the exact same request multiple times and processing it in parallel on multiple nodes and accepting the result from the first node to finish. This is not unlike a TV game
...
sPARE
Partial Replication for Multi-tier Applications in the Cloud
Offering consistent low latency remains a key challenge for distributed applications, especially when deployed on the cloud where virtual machines (VMs) suffer from capacity variability caused by colocated tenants. Replicating redundant requests was shown to be an effective m ...
Power of redundancy
Designing partial replication for multi-tier applications
Replicating redundant requests has been shown to be an effective mechanism to defend application performance from high capacity variability - the common pitfall in the cloud. While the prior art centers on single-tier systems, it still remains an open question how to design repli
...
Dual Scaling VMs and Queries
Cost-Effective Latency Curtailment
Wimpy virtual instances equipped with few cores and little RAM are popular public and private cloud offerings because of their low cost for hosting applications. The challenge is how to run latency-sensitive applications using such instances, which trade off performance ...
To ensure the scalability of big data analytics, approximate MapReduce platforms emerge to explicitly trade off accuracy for latency. A key step to determine optimal approximation levels is to capture the latency of big data jobs, which has long been deemed challenging due to the compl
...
Workload redundancy emerges as an effective method to guarantee quality of service (QoS) targets, especially tail latency, in environments with strong capacity variability such as clouds. Nevertheless, mostly single-tier replication strategies have been studied, while multi-tier a
...