B. Özkan
Please Note
29 records found
1
Apache Spark is a widely adopted framework for large-scale data processing. However, in industrial analytics environments, Spark's built-in schedulers, such as FIFO and fair scheduling, struggle to maintain both user-level fairness and low mean response time, particularly in long-running shared applications. Existing solutions typically focus on job-level fairness which unintentionally favors users who submit more jobs. Although Spark offers a built-in fair scheduler, it lacks adaptability to dynamic user workloads and may degrade overall job performance. We present the User Weighted Fair Queuing (UWFQ) scheduler, designed to minimize job response times while ensuring equitable resource distribution across users and their respective jobs. UWFQ simulates a virtual fair queuing system and schedules jobs based on their estimated finish times under a bounded fairness model. To further address task skew and reduce priority inversions, which are common in Spark workloads, we introduce runtime partitioning, a method that dynamically refines task granularity based on expected runtime. We implement UWFQ within the Spark framework and evaluate its performance using multi-user synthetic workloads and Google cluster traces. We show that UWFQ reduces the average response time of small jobs by up to 74% compared to existing built-in Spark schedulers and to state-of-the-art fair scheduling algorithms.
In this paper, we propose conflict-free replicated data types (CRDTs) for probabilistic data structures with approximate membership queries. Specifically, we introduce Conflict-Free Replicated Bloom Filters and Conflict-Free Replicated Cuckoo Filters, which are the conflict-free versions of traditional Bloom and Cuckoo filters. We provide implementations of these data structures in an open-source repository and present an evaluation of the approximate query results across various workload and replica synchronization configurations. ...
In this paper, we propose conflict-free replicated data types (CRDTs) for probabilistic data structures with approximate membership queries. Specifically, we introduce Conflict-Free Replicated Bloom Filters and Conflict-Free Replicated Cuckoo Filters, which are the conflict-free versions of traditional Bloom and Cuckoo filters. We provide implementations of these data structures in an open-source repository and present an evaluation of the approximate query results across various workload and replica synchronization configurations.
GRAIL
Checking Transaction Isolation Violations with Graph Queries
Distributed databases are surging in popularity with the growing need for performance and fault tolerance. However, implementing transaction isolation models on distributed databases is more challenging due to their sharding and replication. As a result, they can produce executions that violate their claimed isolation guarantees. In this work, we propose a novel isolation model-agnostic approach that utilizes graph databases to efficiently detect isolation violations expressed as anti-patterns in transactional dependency graphs. To illustrate our approach, we introduce the GRAIL framework, implemented on top of the popular ArangoDB and Neo4j graph databases. GRAIL combines soundness guarantees and high performance with understandable, detailed counter-examples.
The Probabilistic Concurrency Testing (PCT) algorithm that provides theoretical guarantees on the probability of detecting concurrency bugs does not apply to weak memory programs. The PCT algorithm builds on the interleaving semantics of sequential consistency, which does not hold for weak memory concurrency. It is because weak memory concurrency allows additional behaviors that cannot be produced by any interleaving execution. In this paper, we generalize PCT to address weak memory concurrency and present Probabilistic Concurrency Testing for Weak Memory (PCTWM). We empirically evaluate PCTWM on a set of well-known weak memory program benchmarks in comparison to the state-of-the-art weak memory testing tool C11Tester. Our results show that PCTWM can detect concurrency bugs more frequently than C11Tester.
Systematic concurrency testing techniques, which explore all possible delivery orderings of messages to uncover concurrency bugs, are not scalable to large distributed systems such as blockchains.
Random concurrency testing methods search for bugs in a randomly generated set of executions and offer a practical testing method.
In this paper, we investigate the effectiveness of random concurrency testing on blockchain systems using a case study on the XRP Ledger of the Ripple blockchain, which maintains one of the most popular cryptocurrencies in the market today. We test the Ripple consensus algorithm of the XRP Ledger by exploring different delivery orderings of consensus protocol messages.
Moreover, we design an evolutionary algorithm to guide the random test case generation toward certain system behaviors to discover concurrency bugs more efficiently.
Our case study shows that random concurrency testing is effective at detecting concurrency bugs in blockchains, and the evolutionary approach for test generation improves test efficiency.
Our experiments could successfully detect the bugs we seeded in the Ripple source code. Moreover, we discovered a previously unknown concurrency bug in the production implementation of Ripple. ...
Systematic concurrency testing techniques, which explore all possible delivery orderings of messages to uncover concurrency bugs, are not scalable to large distributed systems such as blockchains.
Random concurrency testing methods search for bugs in a randomly generated set of executions and offer a practical testing method.
In this paper, we investigate the effectiveness of random concurrency testing on blockchain systems using a case study on the XRP Ledger of the Ripple blockchain, which maintains one of the most popular cryptocurrencies in the market today. We test the Ripple consensus algorithm of the XRP Ledger by exploring different delivery orderings of consensus protocol messages.
Moreover, we design an evolutionary algorithm to guide the random test case generation toward certain system behaviors to discover concurrency bugs more efficiently.
Our case study shows that random concurrency testing is effective at detecting concurrency bugs in blockchains, and the evolutionary approach for test generation improves test efficiency.
Our experiments could successfully detect the bugs we seeded in the Ripple source code. Moreover, we discovered a previously unknown concurrency bug in the production implementation of Ripple.
Large scale production distributed systems are difficult to design and test. Correctness must be ensured when processes run asynchronously, at arbitrary rates relative to each other, and in the presence of failures, e.g., process crashes or message losses. These conditions create a huge space of executions that is difficult to explore in a principled way. Current testing techniques focus on systematic or randomized exploration of all executions of an implementation while treating the implemented algorithms as black boxes. On the other hand, proofs of correctness of many of the underlying algorithms often exploit semantic properties that reduce reasoning about correctness to a subset of behaviors. For example, the communication-closure property, used in many proofs of distributed consensus algorithms, shows that every asynchronous execution of the algorithm is equivalent to a lossy synchronous execution, thus reducing the burden of proof to only that subset. In a lossy synchronous execution, processes execute in lock-step rounds, and messages are either received in the same round or lost forever-such executions form a small subset of all asynchronous ones. We formulate the communication-closure hypothesis, which states that bugs in implementations of distributed consensus algorithms will already manifest in lossy synchronous executions and present a testing algorithm based on this hypothesis. We prioritize the search space based on a bound on the number of failures in the execution and the rate at which these failures are recovered. We show that a random testing algorithm based on sampling lossy synchronous executions can empirically find a number of bugs-including previously unknown ones-in production distributed systems such as Zookeeper, Cassandra, and Ratis, and also produce more understandable bug traces.
We present a method for verifying whether all executions of a set of transactions satisfy a given invariant when run on weakly consistent systems. Existing approaches check that all executions under weak consistency are equivalent to some serial execution of the transactions, and separately that the serial executions satisfy the invariant. While sound, this can be overly strict. Programs running on systems with weak guarantees are usually designed to tolerate some anomalies w.r.t. the serial semantics and yet maintain some expected program invariants even on executions that are not serializable. In contrast, our technique does not restrict possible executions to be serializable, but directly checks whether given program properties hold w.r.t. all executions allowed under varying consistency models. Our approach uses symbolic execution techniques and satisfiability checkers. We summarize the effects of transactions using symbolic execution and build a satisfiability formula that precisely captures all possible valuations of the data variables under a given consistency model. Then, we check whether the program invariants hold on the resulting symbolic set of behaviors. Our encoding is parameterized over the underlying consistency specification. Hence, the programmer can check the correctness of a program under several consistency models—eventual consistency, causal consistency, (parallel) snapshot isolation, serializability— and identify the level of consistency needed to satisfy the application-level invariants.
Linearizability is a key correctness property for concurrent data types. Linearizability requires that the behavior of concurrently invoked operations of the data type be equivalent to the behavior in an execution where each operation takes effect at an instantaneous point of time between its invocation and return. Given an execution trace of operations, the problem of verifying its linearizability is NP-complete, and current exhaustive search tools scale poorly. In this work, we empirically show that linearizability of an execution trace is often witnessed by a schedule that orders only a small number of operations (the "linearizability depth") in a specific way, independently of other operations. Accordingly, one can structure the search for linearizability witnesses by exploring schedules of low linearizability depth first. We provide such an algorithm. Key to our algorithm is a procedure to generate a strong d-hitting family of schedules, which is guaranteed to cover all linearizability witnesses of depth d. A strong d-hitting family of schedules of an execution trace consists of a set of schedules, such that for each tuple of d operations in the trace, there is a schedule in the family that (i) executes these operations in the order they appear in the tuple, and (ii) as late as possible in the execution. We show that most linearizable execution traces from existing benchmarks can be witnessed by strongly d-hitting schedules for d ≤ 5. Our result suggests a practical and automated method for showing linearizability of a trace based on a prioritization of schedules parameterized by the linearizability depth.
Distributed and concurrent applications often have subtle bugs that only get exposed under specific schedules. While these schedules may be found by systematic model checking techniques, in practice, model checkers do not scale to large systems. On the other hand, naive random exploration techniques often require a very large number of runs to find the specific interactions needed to expose a bug. In recent years, several random testing algorithms have been proposed that, on the one hand, exploit state-space reduction strategies from model checking and, on the other, provide guarantees on the probability of hitting bugs of certain kinds. These existing techniques exploit two orthogonal strategies to reduce the state space: Partial-order reduction and bug depth. Testing algorithms based on partial order techniques, such as RAPOS or POS, ensure nonredundant exploration of independent interleavings among system events by imposing an equivalence relation on schedules and ideally exploring only one schedule from each equivalence class. Techniques based on bug depth, such as PCT, exploit the empirical observation that many bugs are exposed by the clever scheduling of a small number of key events. They bias the sample space of schedules to only cover all executions of small depth, rather than the much larger space of all schedules. At this point, there is no random testing algorithm that combines the power of both approaches. In this paper, we provide such an algorithm. Our algorithm, trace-aware PCT (taPCT), extends and unifies several different algorithms in the random testing literature. It samples the space of low-depth executions by constructing a schedule online, while taking dependencies among events into account. Moreover, the algorithm comes with a theoretical guarantee on the probability of sampling a trace of low depthDthe probability grows exponentially with the depth but only polynomially with the number of racy events explored. We further show that the guarantee is optimal among a large class of techniques. We empirically compare our algorithm with several state-of-the-art random testing approaches for concurrent software on two large-scale distributed systems, Zookeeper and Cassandra, and show that our approach is effective in uncovering subtle bugs and usually outperforms related random testing algorithms.