Jérémie Decouchant
CCBNet
Confidential Collaborative Bayesian Networks Inference
Effective large-scale process optimization in manufacturing industries requires close cooperation between different human expert parties who encode their knowledge of related domains as Bayesian network models. For instance, Bayesian networks for domains such as lithography equipment, processes, and auxiliary tools must be used jointly to effectively identify process optimizations in the semiconductor industry. However, business confidentiality across domains hinders such collaboration, and encourages alternatives to centralized inference. We propose CCBNet, the first Confidentiality-preserving Collaborative Bayesian Networks inference framework. CCBNet leverages secret sharing to securely perform analysis on the combined knowledge of party models by joining two novel subprotocols: (i) CABN, which augments probability distributions for variables across parties by modeling them into secret shares of their normalized combination; and (ii) SAVE, which aggregates party inference result shares through distributed variable elimination. We extensively evaluate CCBNet on 9 public Bayesian networks. Our results show that CCBNet achieves predictive quality similar to that of centralized methods while preserving model confidentiality. We further demonstrate that CCBNet scales to challenging manufacturing use cases that involve 16–128 parties in large networks of 223–1003 variables, and decreases, on average, computational overhead by 23%, while communicating 71k values per request. Finally, we showcase possible attacks and mitigations for partially reconstructing party networks in the protocol.
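As a rough illustration of the secret-sharing building block this abstract relies on, the sketch below splits integers into additive shares and sums two shared values without any party revealing its input. It is not the CABN or SAVE subprotocol; the modulus `PRIME` and the function names `share`, `reconstruct`, and `add_shared` are assumptions made for illustration only.

```python
import secrets

# Assumed modulus for the share arithmetic; an illustrative choice only.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % PRIME


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally, with no communication."""
    return [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
```

Under these assumptions, `reconstruct(add_shared(share(10, 4), share(32, 4)))` recovers the sum of the two inputs, while no single party's share reveals either operand.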
Apache Spark is a widely adopted framework for large-scale data processing. However, in industrial analytics environments, Spark's built-in schedulers, such as FIFO and fair scheduling, struggle to maintain both user-level fairness and low mean response time, particularly in long-running shared applications. Existing solutions typically focus on job-level fairness, which unintentionally favors users who submit more jobs. Although Spark offers a built-in fair scheduler, it lacks adaptability to dynamic user workloads and may degrade overall job performance. We present the User Weighted Fair Queuing (UWFQ) scheduler, designed to minimize job response times while ensuring equitable resource distribution across users and their respective jobs. UWFQ simulates a virtual fair queuing system and schedules jobs based on their estimated finish times under a bounded fairness model. To further address task skew and reduce priority inversions, which are common in Spark workloads, we introduce runtime partitioning, a method that dynamically refines task granularity based on expected runtime. We implement UWFQ within the Spark framework and evaluate its performance using multi-user synthetic workloads and Google cluster traces. We show that UWFQ reduces the average response time of small jobs by up to 74% compared to existing built-in Spark schedulers and to state-of-the-art fair scheduling algorithms.
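The idea of scheduling by estimated finish times in a virtual fair queue can be sketched with textbook weighted fair queuing. This is a simplified illustration, not the UWFQ scheduler: it assumes all jobs are submitted at time 0, that job sizes are known, and the names `wfq_order`, `jobs`, and `weights` are hypothetical.

```python
import heapq


def wfq_order(jobs, weights):
    """Order jobs by virtual finish tag.

    jobs: list of (user, job_id, estimated_size), all submitted at time 0.
    weights: dict mapping each user to a positive weight.
    """
    last_finish = {}  # per-user finish tag of the most recently queued job
    heap = []
    for seq, (user, job_id, size) in enumerate(jobs):
        start = last_finish.get(user, 0.0)
        finish = start + size / weights[user]
        last_finish[user] = finish
        # seq breaks ties deterministically in submission order
        heapq.heappush(heap, (finish, seq, job_id))
    order = []
    while heap:
        _, _, job_id = heapq.heappop(heap)
        order.append(job_id)
    return order


jobs = [("alice", "a1", 4), ("alice", "a2", 4), ("alice", "a3", 4), ("bob", "b1", 6)]
order = wfq_order(jobs, {"alice": 1.0, "bob": 1.0})
```

In this example the finish tags accumulate per user, so Bob's single job is scheduled after Alice's first job rather than behind all three of hers, which is the user-level (rather than job-level) fairness the abstract refers to.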
Following the design of more efficient blockchain consensus algorithms, the execution layer has emerged as the new performance bottleneck of blockchains, especially under high contention. Current parallel execution frameworks rely on either optimistic concurrency control (OCC) or pessimistic concurrency control (PCC), both of which see their performance decrease when workloads are highly contended, albeit for different reasons. In this work, we present NEMO, a new blockchain execution engine that combines OCC with the object data model to address this challenge. NEMO introduces three core innovations: (i) a greedy commit rule for transactions that do not use shared objects; (ii) refined handling of dependencies to reduce re-executions; and (iii) the use of incomplete but statically derivable read/write hints to guide execution. Through simulated execution experiments, we demonstrate that NEMO significantly reduces redundant computation and achieves higher throughput than representative approaches. For example, with 16 workers, NEMO's throughput is up to 42% higher than that of BlockSTM, the state-of-the-art OCC approach, and 61% higher than that of the pessimistic concurrency control baseline used.
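To make the cost of contention under OCC concrete, here is a minimal generic sketch of optimistic execution with validation and re-execution. It does not implement NEMO's commit rule, dependency handling, or hints; the helpers `execute`, `run_occ`, and `transfer` are hypothetical, and the speculative phase is run sequentially here where a real engine would run it in parallel.

```python
def execute(tx, view):
    """Run tx against a read-only view, recording which keys it reads."""
    reads = set()

    def read(key):
        reads.add(key)
        return view.get(key, 0)

    writes = tx(read)
    return reads, writes


def run_occ(state, txs):
    """Speculate on a snapshot, then commit in order, re-executing on conflict."""
    snapshot = dict(state)
    results = [execute(tx, snapshot) for tx in txs]  # speculative phase
    written = set()
    reexecutions = 0
    for tx, (reads, writes) in zip(txs, results):
        if reads & written:
            # An earlier committed transaction overwrote something this one read.
            reads, writes = execute(tx, state)
            reexecutions += 1
        state.update(writes)
        written |= set(writes)
    return reexecutions


def transfer(src, dst, amount):
    """Build a transaction that moves amount from src to dst."""
    def tx(read):
        return {src: read(src) - amount, dst: read(dst) + amount}
    return tx


state = {"a": 10, "b": 0, "c": 0, "d": 5}
txs = [transfer("a", "b", 3), transfer("b", "c", 1), transfer("d", "e", 2)]
reexecutions = run_occ(state, txs)
```

Here only the second transfer is re-executed, because it read `b` after the first transfer wrote it; the third touches disjoint keys and commits its speculative result directly. Under high contention most transactions fall into the first case, which is the redundant computation the abstract targets.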
LIGHT-HIDRA
Scalable and decentralized resource orchestration in Fog-IoT environments
Spyker
Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients
Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively by synchronously exchanging intermediate model weights with a single server. The scalability of such FL systems can be limited by two factors: server idle time due to synchronous communication and the risk of a single server becoming the bottleneck. In this paper, we propose a new FL architecture, Spyker, the first multi-server FL system that is entirely asynchronous, and therefore addresses these two limitations simultaneously. Spyker keeps both servers and clients continuously active. As in previous multi-server methods, clients interact solely with their nearest server, ensuring efficient update integration into the model. Unlike in those methods, however, servers also periodically update each other asynchronously, and never postpone interactions with clients. We compare Spyker to three representative baselines - FedAvg, FedAsync and HierFAVG - on the MNIST and CIFAR-10 image classification datasets and on the WikiText-2 language modeling dataset. Spyker converges to similar or higher accuracy levels than these baselines and requires 61% less time to do so in geo-distributed settings.
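As background on asynchronous aggregation, the sketch below shows a staleness-weighted mixing rule in the style of FedAsync, one of the baselines named above. It is not Spyker's update rule, which the abstract does not specify; the function name `async_update`, the mixing rate `alpha`, and the polynomial staleness exponent `a` are illustrative assumptions.

```python
def async_update(global_w, client_w, staleness, alpha=0.5, a=0.5):
    """Mix one client's weights into the global model as soon as they arrive.

    staleness: number of global updates applied since the client fetched
    the model it trained on. Staler updates get a smaller mixing weight.
    """
    alpha_t = alpha * (staleness + 1) ** (-a)
    return [(1 - alpha_t) * g + alpha_t * c for g, c in zip(global_w, client_w)]


fresh = async_update([0.0, 0.0], [2.0, 4.0], staleness=0)
stale = async_update([0.0, 0.0], [2.0, 4.0], staleness=3)
```

With these defaults, a fresh update moves the global model halfway toward the client's weights, while an update that is three versions stale moves it only a quarter of the way, so the server never has to wait for slow clients.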
MUDGUARD
Taming Malicious Majorities in Federated Learning using Privacy-preserving Byzantine-robust Clustering
Byzantine-robust Federated Learning (FL) aims to counter malicious clients and train an accurate global model while maintaining an extremely low attack success rate. Most existing systems, however, are only robust when most of the clients are honest. FLTrust (NDSS '21) and Zeno++ (ICML '20) do not make such an honest majority assumption but can only be applied to scenarios where the server is provided with an auxiliary dataset used to filter malicious updates. FLAME (USENIX '22) and EIFFeL (CCS '22) maintain the semi-honest majority assumption to guarantee robustness and the confidentiality of updates. It is therefore currently impossible to ensure Byzantine robustness and confidentiality of updates without assuming a semi-honest majority. To tackle this problem, we propose a novel Byzantine-robust and privacy-preserving FL system, called MUDGUARD, to handle a malicious minority on the server side and a malicious majority on the client side. Our experimental results demonstrate that the accuracy of MUDGUARD is practically close to the FL baseline using FedAvg without attacks (a gap of approximately 0.8% on average). Meanwhile, the attack success rate is around 0%-5% even under an adaptive attack tailored to MUDGUARD. We further optimize our design by using binary secret sharing and polynomial transformation, which reduces communication overhead by 67%-89.17% and runtime by 66.05%-68.75%.
LO
An Accountable Mempool for MEV Resistance
Manipulation of user transactions by miners in permissionless blockchain systems is a growing concern. This pervasive and systemic problem, known as Miner Extractable Value (MEV), incurs high costs for users of decentralised applications. Furthermore, transaction manipulations create other issues such as congestion, higher fees, and system instability. Detecting transaction manipulations is difficult, even though it is known that they originate from the pre-consensus phase of transaction selection for building blocks, at the base layer of blockchain protocols. In this paper, we summarize known transaction manipulation attacks. We present LO, an accountable base layer protocol designed to detect and mitigate transaction manipulations. LO is built around the accurate detection of transaction manipulations and assignment of blame at the granularity of a single mining node. LO forces miners to log all the transactions they receive into a secure mempool data structure and to process them in a verifiable manner. Overall, LO quickly and efficiently detects censorship, injection or re-ordering attempts. Our performance evaluation shows that LO is also practical and only introduces a marginal performance overhead.
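One generic way to make a transaction log tamper-evident is a hash chain, in which each entry commits to everything logged before it. The sketch below illustrates that idea only; it is an assumption for illustration and not LO's actual mempool data structure, and the names `append_tx`, `verify_log`, and `GENESIS` are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # assumed digest standing in for "no previous entry"


def append_tx(log, tx):
    """Append tx with a digest that commits to the previous entry."""
    prev = log[-1][1] if log else GENESIS
    digest = hashlib.sha256((prev + tx).encode()).hexdigest()
    log.append((tx, digest))
    return digest


def verify_log(log):
    """Recompute the chain; dropping or reordering an entry breaks it."""
    prev = GENESIS
    for tx, digest in log:
        if hashlib.sha256((prev + tx).encode()).hexdigest() != digest:
            return False
        prev = digest
    return True


log = []
for tx in ["tx1", "tx2", "tx3"]:
    append_tx(log, tx)
```

An auditor holding the latest digest can then detect a log in which an earlier transaction was removed (censorship) or two entries were swapped (re-ordering), since the recomputed chain no longer matches.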
...
DNA computing is an emerging field that aims at enabling more efficient data storage and processing. One principle of DNA computing is to encode some information (e.g., image, video, programming scripts) into a digital DNA-like sequence and then synthesize the corresponding DNA molecule. Synthesizing this molecule using digital or real human genomic fragments theoretically opens the possibility for privacy attacks, which have been demonstrated on a large array of human genomic data. These privacy attacks aim at breaching the privacy of DNA samples, allowing an attacker to discover privacy-critical information from the partial or complete DNA information of an individual. In the context of DNA computing, novel privacy attacks will certainly emerge and could consist of discovering a part of a particular script or video that is privacy-critical. It is therefore important to consider whether privacy attacks and defense mechanisms can be used when manipulating genomic data. First, this chapter provides the background on genomic data and its modern generation and processing. It then provides a survey of known genomic privacy attacks, and presents the privacy-enhancing technologies that have been designed to protect genomic data. Later, this chapter also introduces the current trust management methods one can rely on to further secure DNA storage and processing methods, before discussing how DNA computing currently relates to those attacks and privacy-preserving technologies. Finally, this chapter presents future research avenues.
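The encoding principle mentioned at the start of this abstract can be illustrated with the simplest possible mapping, two bits per nucleotide. This is a toy sketch under that assumption; practical DNA storage codes add error correction and avoid problematic patterns such as long runs of the same base, and the names `BASES`, `encode`, and `decode` are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def decode(seq: str) -> bytes:
    """Invert encode; expects a sequence whose length is a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for ch in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(ch)
        out.append(byte)
    return bytes(out)
```

Because the mapping is invertible, anyone who obtains the sequence can read back the original bytes, which is why the confidentiality of the encoded content becomes a concern.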