FASTR: Fast Resilience for Stateful Programmable Data Planes
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Programmable data plane devices have enabled various in-network applications that rely on locally stored state to deliver low-latency, high-throughput services. However, these applications are susceptible to network failures, which can disrupt state access and network functionality. Timely and reliable failure detection is therefore a critical component of a stateful data plane. In this paper, we propose FASTR, a data plane framework that enables microsecond-scale failure detection between directly connected switches. FASTR achieves sub-$10\,\mu$s detection latency by implementing a heartbeat mechanism in the data plane. FASTR also incorporates traffic-awareness to reduce overhead and priority queuing to avoid false alarms. We validate FASTR with hardware experiments, demonstrating that it consistently detects failures within $10\,\mu$s using a $4\,\mu$s heartbeat interval while remaining robust to network congestion.
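The heartbeat and traffic-awareness ideas in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not FASTR's data plane implementation: the class `LinkMonitor`, its method names, the rule that any received packet counts as evidence of liveness, and the timeout of two heartbeat intervals ($8\,\mu$s) are all hypothetical choices made for illustration. Only the $4\,\mu$s interval comes from the abstract. Priority queuing is not modeled.

```python
from dataclasses import dataclass


@dataclass
class LinkMonitor:
    """Toy model of one end of a link between two directly connected switches."""

    interval_ns: int = 4_000   # heartbeat interval (4 us, as in the abstract)
    timeout_ns: int = 8_000    # ASSUMPTION: declare failure after two silent intervals
    last_tx_ns: int = 0        # time the last packet (data or heartbeat) was sent
    last_rx_ns: int = 0        # time the last packet (data or heartbeat) was received
    heartbeats_sent: int = 0
    failed: bool = False

    def on_transmit(self, now_ns: int) -> None:
        """A regular data packet was sent; it doubles as a liveness signal."""
        self.last_tx_ns = now_ns

    def on_receive(self, now_ns: int) -> None:
        """Any packet from the neighbor (data or heartbeat) proves the link is up."""
        self.last_rx_ns = now_ns
        self.failed = False  # ASSUMPTION: the link recovers as soon as traffic resumes

    def tick(self, now_ns: int) -> bool:
        """Periodic check. Returns True if a heartbeat must be sent now."""
        # Failure detection: nothing heard from the neighbor for too long.
        if now_ns - self.last_rx_ns > self.timeout_ns:
            self.failed = True
        # Traffic-awareness: only send a heartbeat if the link has been idle
        # in the transmit direction for a full interval.
        if now_ns - self.last_tx_ns >= self.interval_ns:
            self.last_tx_ns = now_ns
            self.heartbeats_sent += 1
            return True
        return False
```

In this model, a link that already carries data traffic in both directions generates no heartbeats at all, which is one plausible reading of how traffic-awareness reduces overhead. The detection latency is bounded by `timeout_ns` plus the tick granularity.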
Files
File under embargo until 22-06-2026