Elastic scaling of P4 network functions

Abstract

Recently emerged network concepts such as Network Function Virtualization (NFV) and Software-Defined Networking (SDN) promise to bring more flexibility to existing networks, and they also pave the way for a new type of service. These novel services, known as mission-critical services, generally have tight latency and jitter requirements. Examples include cloud gaming, cloud-connected virtual reality, and remote surgery. To avoid exceeding the tight tolerances set by these services, the Network Functions (NFs) that are necessary for the network connection (e.g. firewalls) should benefit both from hardware acceleration, for increased performance, and from scaling, to reduce the hardware footprint and introduce more network flexibility. In this thesis, a novel horizontal scaling solution for Virtualized Network Functions (VNFs) was designed and implemented in P4. In particular, we propose an elastic scaling solution for hardware-accelerated switches. Our solution consists of three parts: flow migration, monitoring, and decision-making. Flow migration, in which flows are moved to another switch, is performed in three phases. First, the migration source transfers the NF state to the migration destination. Then, the migration source propagates updates to the NF state as it changes on the source. Finally, once the controller has completed its tasks, flow packets are forwarded to the migration destination, and the controller can divert the traffic to the migration destination directly. Our decision-making algorithm uses VNF, switch, and individual flow usage statistics to decide where to scale to and, if applicable, where to place a new NF instance. To reduce the monitoring overhead, usage statistics are gathered only for high-rate flows, and overloads are detected within the switch.
Our decision-making algorithm can spawn new NF instances when an overload occurs, or redistribute load among existing NF instances to optimize NF resource usage. Results show that, compared to the current state of the art, our algorithm reduces migration time by 0.52 s on average and the latency observed by a flow by approximately 5 ms on average. Furthermore, our solution does not overload the controller, because NF states are communicated directly between switches, which are equipped with links optimized for high data volumes.
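The choice between redistributing load and spawning a new NF instance can be illustrated with a minimal Python sketch. This is not the algorithm from the thesis: the `NFInstance` class, the `decide` function, the 0.8 utilization threshold, and the rule of moving the highest-rate flow to the least-utilized instance are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NFInstance:
    """Hypothetical view of one NF instance as seen by the decision logic."""
    name: str
    capacity: float                              # assumed maximum sustainable rate
    flows: dict = field(default_factory=dict)    # high-rate flow id -> measured rate

    @property
    def load(self) -> float:
        return sum(self.flows.values())


def decide(instances: dict, source: str, threshold: float = 0.8):
    """Return ("none" | "migrate" | "spawn", flow_id, target_name).

    Assumed policy: if the source instance exceeds threshold * capacity,
    move its highest-rate flow to the least-utilized instance that would
    stay under the threshold; if no such instance exists, spawn a new one.
    """
    src = instances[source]
    if src.load <= threshold * src.capacity:
        return ("none", None, None)

    # Pick the highest-rate flow on the overloaded instance.
    flow_id, rate = max(src.flows.items(), key=lambda kv: kv[1])

    # Existing instances that can absorb the flow without overloading.
    candidates = [
        inst for inst in instances.values()
        if inst.name != src.name and inst.load + rate <= threshold * inst.capacity
    ]
    if candidates:
        target = min(candidates, key=lambda inst: inst.load / inst.capacity)
        return ("migrate", flow_id, target.name)
    return ("spawn", flow_id, None)


if __name__ == "__main__":
    instances = {
        "a": NFInstance("a", 100.0, {"f1": 60.0, "f2": 30.0}),
        "b": NFInstance("b", 100.0, {"f3": 10.0}),
    }
    print(decide(instances, "a"))
```

In this sketch only high-rate flows appear in `flows`, mirroring the idea of limiting monitoring to such flows; a real deployment would also need the switch and VNF statistics mentioned above.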
