Circular Image

Jérémie Decouchant

info

Please Note

35 records found

Confidential Collaborative Bayesian Networks Inference

Conference paper (2026) - Abele Mălan, Thiago Guzella, Jérémie Decouchant, Lydia Chen
Effective large-scale process optimization in manufacturing industries requires close cooperation between different human expert parties who encode their knowledge of related domains as Bayesian network models. For instance, Bayesian networks for domains such as lithography equipment, processes, and auxiliary tools must be conjointly used to effectively identify process optimizations in the semiconductor industry. However, business confidentiality across domains hinders such collaboration, and encourages alternatives to centralized inference. We propose CCBNet, the first Confidentiality-preserving Collaborative Bayesian Networks inference framework. CCBNet leverages secret sharing to securely perform analysis on the combined knowledge of party models by joining two novel subprotocols: (i) CABN, which augments probability distributions for variables across parties by modeling them into secret shares of their normalized combination; and (ii) SAVE, which aggregates party inference result shares through distributed variable elimination. We extensively evaluate CCBNet via 9 public Bayesian networks. Our results show that CCBNet achieves predictive quality that is similar to the ones of centralized methods while preserving model confidentiality. We further demonstrate that CCBNet scales to challenging manufacturing use cases that involve 16–128 parties in large networks of 223–1003 variables, and decreases, on average, computational overhead by 23%, while communicating 71k values per request. Finally, we showcase possible attacks and mitigations for partially reconstructing party networks in the protocol. ...
Conference paper (2026) - D. Kažemaks, L. Versluis, B. K. Ozkan, J. Decouchant
Apache Spark is a widely adopted framework for large-scale data processing. However, in industrial analytics environments, Spark's built-in schedulers, such as FIFO and fair scheduling, struggle to maintain both user-level fairness and low mean response time, particularly in long-running shared applications. Existing solutions typically focus on job-level fairness which unintentionally favors users who submit more jobs. Although Spark offers a built-in fair scheduler, it lacks adaptability to dynamic user workloads and may degrade overall job performance. We present the User Weighted Fair Queuing (UWFQ) scheduler, designed to minimize job response times while ensuring equitable resource distribution across users and their respective jobs. UWFQ simulates a virtual fair queuing system and schedules jobs based on their estimated finish times under a bounded fairness model. To further address task skew and reduce priority inversions, which are common in Spark workloads, we introduce runtime partitioning, a method that dynamically refines task granularity based on expected runtime. We implement UWFQ within the Spark framework and evaluate its performance using multi-user synthetic workloads and Google cluster traces. We show that UWFQ reduces the average response time of small jobs by up to 74% compared to existing built-in Spark schedulers and to state-of-the-art fair scheduling algorithms. ...
Conference paper (2025) - Rowdy Chotkan, Bart Cox, Vincent Rahli, Jérémie Decouchant
Reliable communication is a fundamental distributed communication abstraction that allows any two nodes within a network to communicate with each other. It is necessary for more powerful communication primitives, such as broadcast and consensus. Using different authentication models, two classical protocols implement reliable communication in unknown and sufficiently connected networks. In the former, network links are authenticated, and processes rely on dissemination paths to authenticate messages. In the latter, processes generate digital signatures that are flooded throughout the network. This work considers the hybrid system model that combines authenticated links and authenticated processes. Additionally, we aim to leverage the possible presence of trusted nodes (e.g., network gateways) and trusted components (e.g., Intel SGX enclaves). We first extend the two classical reliable communication protocols to leverage trusted nodes. Then we propose DualRC, our most generic algorithm that considers the hybrid authentication model by manipulating dissemination paths and digital signatures, and leverages the possible presence of trusted nodes and trusted components. We describe and prove methods that establish whether our algorithms implement reliable communication on a given network. ...
Conference paper (2025) - F. Ezard, C. U. Ileri, J. Decouchant
Following the design of more efficient blockchain consensus algorithms, the execution layer has emerged as the new performance bottleneck of blockchains, especially under high contention. Current parallel execution frameworks either rely on optimistic concurrency control (OCC) or on pessimistic concurrency control (PCC), both of which see their performance decrease when workloads are highly contended, albeit for different reasons. In this work, we present NEMO, a new blockchain execution engine that combines OCC with the object data model to address this challenge. NEMO introduces three core innovations: (i) a greedy commit rule for transactions that do not use shared objects; (ii) refined handling of dependencies to reduce re-executions; and (iii) the use of incomplete but statically derivable read/write hints to guide execution. Through simulated execution experiments, we demonstrate that NEMO significantly reduces redundant computation and achieves higher throughput than representative approaches. For example, with 16 workers nemo's throughput is up to 42% higher than the one of BlockSTM, the state-of-the-art OCC approach, and 61% higher than the pessimistic concurrency control baseline used. ...

Taming Malicious Majorities in Federated Learning using Privacy-preserving Byzantine-robust Clustering

Conference paper (2025) - Rui Wang, Xingkai Wang, Huanhuan Chen, Jérémie Decouchant, Stjepan Picek, Nikolaos Laoutaris, Kaitai Liang
Byzantine-robust Federated Learning (FL) aims to counter malicious clients and train an accurate global model while maintaining an extremely low attack success rate. Most existing systems, however, are only robust when most of the clients are honest. FLTrust (NDSS '21) and Zeno++ (ICML '20) do not make such an honest majority assumption but can only be applied to scenarios where the server is provided with an auxiliary dataset used to filter malicious updates. FLAME (USENIX '22) and EIFFeL (CCS '22) maintain the semi-honest majority assumption to guarantee robustness and the confidentiality of updates. It is, therefore, currently impossible to ensure Byzantine robustness and confidentiality of updates without assuming a semi-honest majority. To tackle this problem, we propose a novel Byzantine-robust and privacy-preserving FL system, called MUDGUARD, to capture malicious minority and majority for server and client sides, respectively. Our experimental results demonstrate that the accuracy of MUDGUARD is practically close to the FL baseline using FedAvg without attacks (≈0.8% gap on average). Meanwhile, the attack success rate is around 0%-5% even under an adaptive attack tailored to MUDGUARD. We further optimize our design by using binary secret sharing and polynomial transformation, leading to communication overhead and runtime decreases of 67%-89.17% and 66.05%-68.75%, respectively. ...
Conference paper (2025) - R. Chotkan, B. Nasrulin, J. Decouchant, J. Pouwelse
Spam poses a growing threat to blockchain networks. Adversaries can easily create multiple accounts to flood transaction pools, inflating fees and degrading service quality. Existing defenses against spam, such as fee markets and staking requirements, primarily rely on economic deterrence, which fails to distinguish between malicious and legitimate users and often exclude low-value but honest activity. To address these shortcomings, we present StarveSpam, a decentralized reputation-based protocol that mitigates spam by operating at the transaction relay layer. StarveSpam combines local behavior tracking, peer scoring, and adaptive rate-limiting to suppress abusive actors, without requiring global consensus, protocol changes, or trusted infrastructure. We evaluate StarveSpam using real Ethereum data from a major NFT spam event and show that it outperforms existing fee-based and rule-based defenses, allowing each node to block over $95 \%$ of spam while dropping just $3 \%$ of honest traffic, and reducing the fraction of the network exposed to spam by $85 \%$ compared to existing rule-based methods. StarveSpam offers a scalable and deployable alternative to traditional spam defenses, paving the way toward more resilient and equitable blockchain infrastructure. ...
Conference paper (2025) - Giulio Segalini, Maria Fernandes, Jérémie Decouchant
Summary statistics are essential to analyse large datasets in various fields, including financial and medical research. Federated computations enhance statistical power by combining geo-distributed datasets while ensuring compliance with data protection regulations, privacy guarantees, and resilience against intrusions. We present Tides, a federated framework leveraging Trusted Execution Environments (TEEs) to defend against adversaries controlling up to f of the N datacenters. We present an instantiation of Tides using genomic (GWAS) statistics. We address TEE-specific attack vectors, including communication blocking and side-channel attacks. Tides follows the following three key steps: (1) TEEs share statistical results through reliable broadcast and run a randomized crash-tolerant binary consensus algorithm to identify the datasets that are available; (2) TEEs enforce differential privacy with ad hoc noise; and (3) TEEs run memory-oblivious algorithms to compute the final summary statistics. We implemented Tides with Intel SGX enclaves and demonstrated its practicality with three datasets. ...
Federated Learning (FL) systems evolve in heterogeneous and ever-evolving environments that challenge their performance. Under real deployments, the learning tasks of clients can also evolve with time, which calls for the integration of methodologies such as Continual Learning (CL). To enable research reproducibility, we propose a set of experimental best practices that precisely capture and emulate complex learning scenarios. To the best of our knowledge, our framework, Freddie, is the first entirely configurable framework for Federated Continual Learning (FCL), and it can be seamlessly deployed on a large number of machines leveraging containerization and Kubernetes. We demonstrate the effectiveness of Freddie on two use cases, (i) large-scale concurrent FL on CIFAR100 and (ii) heterogeneous task sequence on FCL, which highlight unaddressed performance challenges in FCL scenarios. ...
Journal article (2024) - Liangrong Zhao, Jérémie Decouchant, Joseph Liu, Qinghua Lu, Jiangshan Yu
Byzantine Fault Tolerance (BFT) Consensus protocols with trusted hardware assistance have been extensively explored for their improved resilience to tolerate more faulty processes. Nonetheless, the potential of trust hardware has been scarcely investigated in leaderless BFT protocols. RedBelly is assumed to be the first blockchain network whose consensus is based on a truly leaderless BFT algorithm. This paper proposes a trusted hardware-assisted leaderless BFT consensus protocol by offering a hybrid solution for the set BFT problem defined in the RedBelly blockchain. Drawing on previous studies, we present two crucial trusted services: the counter and the collector. Based on these two services, we introduce two primitives to formulate our leaderless BFT protocol: a hybrid verified broadcast (VRB) protocol and a hybrid binary agreement. The hybrid VRB protocol enhances the hybrid reliable broadcast protocol by integrating a verification function. This addition ensures that a broadcast message is verified not only for authentication but also for the correctness of its content. Our hybrid BFT consensus is integrated with these broadcast protocols to deliver binary decisions on all proposals. We prove the correctness of the proposed hybrid protocol and demonstrate its enhanced performance in comparison to the prior trusted BFT protocol. ...

Scalable and decentralized resource orchestration in Fog-IoT environments

Journal article (2024) - Carlos Núñez-Gómez, Martijn de Vos, Jérémie Decouchant, Johan Pouwelse, Blanca Caminero, Carmen Carrión
With the proliferation of Internet of Things (IoT) ecosystems, traditional resource orchestration mechanisms, executed on fog devices, encounter significant scalability, reliability and security challenges. To tackle these challenges, recent decentralized algorithms in Fog-IoT use Distributed Ledger Technologies to orchestrate resources and payments between peers. However, while distributed ledgers provide many desirable properties, their consensus mechanism introduces a performance bottleneck. This paper introduces Light-HIDRA, a consensus-less and decentralized resource orchestration system for Fog-IoT environments. At its core, Light-HIDRA uses Byzantine Reliable Broadcast (BRB) to coordinate actions without centralized control, therefore drastically reducing communication overhead and latency compared to consensus-based solutions. Light-HIDRA coordinates the scheduling and execution of workloads, and securely manages the payments that peers receive for dedicating resources to workloads. Light-HIDRA further increases performance and reduces overhead by grouping peers into distinct domains. We conduct an in-depth analysis of the protocol’s security properties, investigating its efficiency and robustness in diverse situations. We evaluate the performance of Light-HIDRA, highlighting its performance over HIDRA, a state-of-the-art baseline that uses smart contracts. Our experiments demonstrate that Light-HIDRA reduces the bandwidth usage by up to 57x, the latency of workload offloading by up to 142x, and shows superior throughput compared to HIDRA. ...
Conference paper (2024) - Wassim Yahyaoui, Joachim Bruneau-Queyreix, Marcus Völp, Jérémie Decouchant
Geo-replication provides disaster recovery after catastrophic accidental failures or attacks, such as fires, blackouts or denial-of-service attacks to a data center or region. Naturally distributed data structures, such as Blockchains, when well designed, are immune against such disruptions, but they also benefit from leveraging locality. In this work, we consolidate the performance of geo-replicated consensus by leveraging novel insights about hierarchical consensus and a construction methodology that allows creating novel protocols from existing building blocks. In particular we show that cluster confirmation, paired with subgroup rotation, allows protocols to safely operate through situations where all members of the global consensus group are Byzantine. We demonstrate our compositional construction by combining the recent HotStuff and Damysus protocols into a hierarchical geo-replicated blockchain with global durability guarantees. We present a compositionality proof and demonstrate the correctness of our protocol, including its ability to tolerate cluster crashes. Our protocol — Orion1 — achieves a 20% higher throughput than GeoBFT, the latest hierarchical Byzantine Fault-Tolerant (BFT) protocol. ...

Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients

Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively through synchronously exchanging the intermediate model weights with a single server. The scalability of such FL systems can be limited by two factors: server idle time due to synchronous communication and the risk of a single server becoming the bottleneck. In this paper, we propose a new FL architecture, Spyker, the first multi-server FL system that is entirely asynchronous, and therefore addresses these two limitations simultaneously. Spyker keeps both servers and clients continuously active. As in previous multi-server methods, clients interact solely with their nearest server, ensuring efficient update integration into the model. Differently, however, servers also periodically update each other asynchronously, and never postpone interactions with clients. We compare Spyker to three representative baselines - FedAvg, FedAsync and HierFAVG - on the MNIST and CIFAR-10 image classification datasets and on the WikiText-2 language modeling dataset. Spyker converges to similar or higher accuracy levels than previous baselines and requires 61% less time to do so in geo-distributed settings. ...

Taming Malicious Majorities in Federated Learning using Privacy-preserving Byzantine-robust Clustering

Journal article (2024) - Rui Wang, Xingkai Wang, Huanhuan Chen, Jérémie Decouchant, Stjepan Picek, Nikolaos Laoutaris, Kaitai Liang
Byzantine-robust Federated Learning (FL) aims to counter malicious clients and train an accurate global model while maintaining an extremely low attack success rate. Most existing systems, however, are only robust when most of the clients are honest. FLTrust (NDSS '21) and Zeno++ (ICML '20) do not make such an honest majority assumption but can only be applied to scenarios where the server is provided with an auxiliary dataset used to filter malicious updates. FLAME (USENIX '22) and EIFFeL (CCS '22) maintain the semi-honest majority assumption to guarantee robustness and the confidentiality of updates. It is therefore currently impossible to ensure Byzantine robustness and confidentiality of updates without assuming a semi-honest majority. To tackle this problem, we propose a novel Byzantine-robust and privacy-preserving FL system, called MUDGUARD, to capture malicious minority and majority for server and client sides, respectively. Our experimental results demonstrate that the accuracy of MUDGUARD is practically close to the FL baseline using FedAvg without attacks (approximate 0.8% gap on average). Meanwhile, the attack success rate is around 0%-5% even under an adaptive attack tailored to MUDGUARD. We further optimize our design by using binary secret sharing and polynomial transformation leading to communication overhead and runtime decreases of 67%-89.17% and 66.05%-68.75%, respectively. ...
The training of diffusion-based models for image generation is predominantly controlled by a select few Big Tech companies, raising concerns about privacy, copyright, and data authority due to their lack of transparency regarding training data. To ad-dress this issue, we propose a federated diffusion model scheme that enables the independent and collaborative training of diffusion models without exposing local data. Our approach adapts the Federated Averaging (FedAvg) algorithm to train a Denoising Diffusion Model (DDPM). Through a novel utilization of the underlying UNet backbone, we achieve a significant reduction of up to 74% in the number of parameters exchanged during training,compared to the naive FedAvg approach, whilst simultaneously maintaining image quality comparable to the centralized setting, as evaluated by the FID score. ...

An Accountable Mempool for MEV Resistance

Manipulation of user transactions by miners in permissionless blockchain systems is a growing concern. This problem is a pervasive and systemic issue that incurs high costs for users of decentralised applications and is known as Miner Extractable Value (MEV). Furthermore, transaction manipulations create other issues such as congestion, higher fees, and system instability. Detecting transaction manipulations is difficult, even though it is known that they originate from the pre-consensus phase of transaction selection for building blocks, at the base layer of blockchain protocols. In this paper, we summarize known transaction manipulation attacks. We present LO, an accountable base layer protocol designed to detect and mitigate transaction manipulations. LO is built around the accurate detection of transaction manipulations and assignment of blame at the granularity of a single mining node. LO forces miners to log all the transactions they receive into a secure mempool data structure and to process them in a verifiable manner. Overall, LO quickly and efficiently detects censorship, injection or re-ordering attempts. Our performance evaluation shows that LO is also practical and only introduces a marginal performance overhead. ...
Byzantine consensus protocols aim at maintaining safety guarantees under any network synchrony model and at providing liveness in partially or fully synchronous networks. However, several Byzantine consensus protocols have been shown to violate liveness properties under certain scenarios. Existing testing methods for checking the liveness of consensus protocols check for time-bounded liveness violations, which generate a large number of false positives. In this work, for the first time, we check the liveness of Byzantine consensus protocols using the temperature and lasso detection methods, which require the definition of ad-hoc system state abstractions. We focus on the HotStuff protocol family that has been recently developed for blockchain consensus. In this family, the HotStuff protocol is both safe and live under the partial synchrony assumption, while the 2-Phase Hotstuff and Sync HotStuff protocols are known to violate liveness in subtle fault scenarios. We implemented our liveness checking methods on top of the Twins automated unit test generator to test the HotStuff protocol family. Our results indicate that our methods successfully detect all known liveness violations and produce fewer false positives than the traditional time-bounded liveness checks. ...
Conference paper (2023) - Tong Cao, J.E.A.P. Decouchant, Jiangshan Yu
We describe and analyze perishing mining, a novel blockwithholding mining strategy that lures profit-driven miners away from doing useful work on the public chain by releasing block headers from a privately maintained chain. We then introduce the dual private chain (DPC) attack, where an adversary that aims at double spending increases its success rate by intermittently dedicating part of its hash power to perishing mining. We detail the DPC attack's Markov decision process, evaluate its double spending success rate using Monte Carlo simulations. We show that the DPC attack lowers Bitcoin's security bound in the presence of profit-driven miners that do not wait to validate the transactions of a block before mining on it.
...
Conference paper (2023) - Ali Shoker, Vincent Rahli, Jérémie Decouchant, Paulo Esteves-Veríssimo
Current vehicular Intrusion Detection and Prevention Systems either incur high false-positive rates or do not capture zero-day vulnerabilities, leading to safety-critical risks. In addition, prevention is limited to few primitive options like dropping network packets or extreme options, e.g., ECU Bus-off state. To fill this gap, we introduce the concept of vehicular Intrusion Resilience Systems (IRS) that ensures the resilience of critical applications despite assumed faults or zero-day attacks, as long as threat assumptions are met. IRS enables running a vehicular application in a replicated way, i.e., as a Replicated State Machine, over several ECUs, and then requiring the replicated processes to reach a form of Byzantine agreement before changing their local state. Our study rides the mutation of modern vehicular environments, which are closing the gap between simple and resource-constrained "real-time and embedded systems", and complex and powerful "information technology" ones. It shows that current vehicle (e.g., Zonal) architectures and networks are becoming plausible for such modular fault and intrusion tolerance solutions—deemed too heavy in the past. Our evaluation on a simulated Automotive Ethernet network running two state-of-the-art agreement protocols (Damysus and Hotstuff) shows that the achieved latency and throughout are feasible for many Automotive applications. ...
Conference paper (2023) - J. Xu, C. Hong, J. Huang, Lydia Y. Chen, J.E.A.P. Decouchant
Federated learning is a private-by-design distributed learning paradigm where clients train local models on their own data before a central server aggregates their local updates to compute a global model. Depending on the aggregation method used, the local updates are either the gradients or the weights of local learning models, e.g., FedAvg aggregates model weights. Unfortunately, recent reconstruction attacks apply a gradient inversion optimization on the gradient update of a single mini- batch to reconstruct the private data used by clients during training. As the state-of-the-art reconstruction attacks solely focus on single update, realistic adversarial scenarios are over- looked, such as observation across multiple updates and updates trained from multiple mini-batches. A few studies consider a more challenging adversarial scenario where only model updates based on multiple mini-batches are observable, and resort to computationally expensive simulation to untangle the underlying samples for each local step. In this paper, we propose AGIC, a novel Approximate Gradient Inversion Attack that efficiently and effectively reconstructs images from both model or gradient updates, and across multiple epochs. In a nutshell, AGIC (i) approximates gradient updates of used training samples from model updates to avoid costly simulation procedures, (ii) leverages gradient/model updates collected from multiple epochs, and (iii) assigns increasing weights to layers with respect to the neural network structure for reconstruction quality. We extensively evaluate AGIC on three datasets, namely CIFAR-10, CIFAR- 100 and ImageNet. Our results show that AGIC increases the peak signal-to-noise ratio (PSNR) by up to 50% compared to two representative state-of-the-art gradient inversion attacks. Furthermore, AGIC is faster than the state-of-the-art simulation- based attack, e.g., it is 5x faster when attacking FedAvg with 8 local steps in between model updates. ...
Book chapter (2023) - Maria Fernandes, Jérémie Decouchant, Francisco M. Couto
DNA computing is an emerging field that aims at enabling more efficient data storage and processing. One principle of DNA computing is to encode some information (e.g., image, video, programming scripts) into a digital DNA-like sequence and then synthesize the corresponding DNA molecule. Synthesizing this molecule using digital or real human genomic fragments theoretically opens the possibility for privacy attacks, which have been demonstrated on a large array of human genomic data. These privacy attacks aim at breaching the privacy of DNA samples, allowing an attacker to discover privacy-critical information from the partial or complete DNA information of an individual. In the context of DNA computing, novel privacy attacks will certainly emerge and could consist in discovering a part of a particular script or video that is privacy-critical. It is therefore important to consider whether privacy attacks and defense mechanisms can be used when manipulating genomic data. First, this chapter provides the background about genomic data, and its modern generation and processing. It then provides a survey on known genomic privacy attacks, and presents the privacy-enhancing technologies that have been designed to protect genomic data. Later, this chapter also introduces the current trust management methods one can rely on to further secure DNA storage and processing methods, before discussing how DNA computing currently relates to those attacks and privacy-preserving technologies. Finally, this chapter presents future research avenues. ...