Mitigating Cold Start Problem in Microservice Auto-scaling on Kubernetes

Abstract

With the advent of the cloud-native paradigm, software development and deployment practices have changed significantly. An increasing number of enterprises are migrating their microservice applications onto Kubernetes, a production-grade container orchestration platform, to take full advantage of its self-healing, service discovery, and auto-scaling capabilities, among others. Kubernetes introduces a host of novel software design ideas but also poses challenges to newcomers. In this thesis, we focus on the cold start problem in microservice auto-scaling on Kubernetes. A cold start refers to a situation where a service receives an overwhelming number of requests from outside but does not have enough backend containers to serve them. Because the auto-scaling process is slow to launch, the cold start problem may increase response latency or, in more severe cases, make the service unavailable. To study and mitigate the adverse effects of cold start, we pose three research questions.

The first research question asks which factors contribute to cold start. We first draw lessons from research on cold start in serverless computing and examine the current service auto-scaling process powered by the Horizontal Pod Autoscaler (HPA), then propose five hypothetical factors that may affect cold-start performance. Our findings indicate that reactive autoscalers like HPA are slow to trigger scaling events and lack a coordination mechanism to scale correlated workloads in sync. In addition, programming languages differ in code loading and execution time. The major factors are slow scaling decision-making and application code loading. In contrast, other factors, such as cluster size and the choice of container runtime and CNI network solution, are negligible.

In the next two research questions, we investigate how to mitigate cold start and how effective the proposed solutions are. Based on our findings from the first research question, we conclude that extra effort is needed to speed up scaling decision-making and reduce application startup time. We compare several technical solutions that address the hypothetical factors. Based on the results, we encourage microservice application developers, especially Java developers, to try and test different programming language stacks, preferably Go, for rapid auto-scaling. In addition, we propose a coordinated horizontal pod autoscaler, named CHPA, as a supplement to HPA. It allows a group of identified, correlated scale targets to scale out in sync rather than each making independent scaling decisions. As a result, no pod lags behind in the auto-scaling process, which mitigates the cold start problem holistically.
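
To make the contrast between independent and coordinated scaling concrete, the following minimal sketch compares the two. The first function follows the replica formula given in the Kubernetes HPA documentation, ignoring its tolerance and stabilization settings. The second is purely illustrative: this abstract does not describe CHPA's algorithm, so the rule of scaling every group member by the largest load ratio, the function names, and the sample workloads are all assumptions.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Documented HPA rule: ceil(currentReplicas * currentMetric / targetMetric).

    Tolerance and stabilization windows are deliberately left out.
    """
    return math.ceil(current_replicas * current_metric / target_metric)


def coordinated_desired_replicas(group):
    """Hypothetical coordination rule (an assumption, not CHPA's actual method):
    scale every workload in the group by the largest load ratio seen in it.
    """
    ratio = max(w["metric"] / w["target"] for w in group.values())
    return {name: math.ceil(w["replicas"] * ratio) for name, w in group.items()}


# Hypothetical correlated workloads: the frontend is already overloaded,
# while the backend has not yet seen the extra traffic.
group = {
    "frontend": {"replicas": 2, "metric": 90.0, "target": 50.0},
    "backend": {"replicas": 3, "metric": 40.0, "target": 50.0},
}

# Each workload decides on its own, as with one HPA per Deployment.
independent = {
    name: hpa_desired_replicas(w["replicas"], w["metric"], w["target"])
    for name, w in group.items()
}

# The whole group scales out together.
coordinated = coordinated_desired_replicas(group)

print("independent:", independent)
print("coordinated:", coordinated)
```

In this sketch, independent decisions leave the backend at its current size until its own metric rises, whereas the coordinated rule scales it out together with the frontend. That is the kind of lag a coordinated autoscaler is meant to avoid.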
