Scheduling the Spark Framework under the Mesos Resource Manager

Using clusters of servers and datacenters to process large numbers of data- and computation-intensive jobs has become mainstream. The need for large clusters is driven by the fact that many workloads grow faster than single-computer performance. To manage processing across a cluster of multiple computers, several frameworks have appeared over the years. These frameworks supply the user with convenient computation constructs and abstract away low-level implementation details, relieving the burden of inter-process communication and task placement in a cluster. However, these frameworks often assume total control over a static partition of the cluster, leading to under-utilization when one framework is over-committed with work while another is idle. To overcome this under-utilization and to multiplex frameworks within a cluster, cluster schedulers have been proposed, which sit on top of the hardware resources and schedule hardware resource leases to frameworks. How these systems differ from each other is not well described. Furthermore, how to achieve \emph{performance balance} between frameworks, such that multiple frameworks achieve similar performance metrics when running time-varying workloads, is relatively unexplored.

We define a taxonomy which describes the combination of cluster schedulers and frameworks, called Two-Level schedulers in general. Using this taxonomy, we characterize multiple state-of-the-art cluster schedulers that are described in the literature or used in practice. We distinguish multiple aspects which define a cluster scheduler; these aspects can dictate how frameworks interface with the cluster scheduler and can also influence framework performance.

We aim to achieve performance balance for multiple \gls{spark} frameworks running under the \gls{mesos} cluster scheduler. First, we evaluate the performance of a single framework running single interactive data analytics queries. We identify multiple configuration parameters which influence the performance of interactive queries. However, we conclude that using \gls{sparkmesos} for interactive queries either results in inefficient use of resources or does not allow us to multiplex resources over multiple frameworks.

We then turn to achieving performance balance for multiple frameworks running non-interactive queries. We first establish a baseline performance balance, which we attain by using knowledge of the possibly different workload intensities run by the frameworks in a real cluster. Afterwards, we achieve a performance balance similar to this baseline for up to three frameworks without knowing the workload intensities a priori. We achieve this with a feedback-loop controller, which dynamically updates the resource shares allocated to frameworks based on their online performance metrics.
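The feedback-loop idea can be illustrated with a minimal sketch. This is a hypothetical illustration, not the controller implemented in this work: each framework reports an online performance metric (here, mean query response time), and the controller shifts resource shares toward frameworks that currently perform worse than average. The function name, the `gain` step size, and the `min_share` floor are all assumptions made for the example.

```python
def rebalance(shares, latencies, gain=0.1, min_share=0.05):
    """Update per-framework resource shares to equalize performance.

    shares:    dict mapping framework name -> current fraction of cluster resources
    latencies: dict mapping framework name -> observed mean response time (seconds)
    gain:      controller step size (assumed value, tuning parameter)
    min_share: floor preventing any framework from being starved (assumed)
    """
    mean_latency = sum(latencies.values()) / len(latencies)
    updated = {}
    for fw, share in shares.items():
        # Frameworks slower than the average get proportionally more resources;
        # faster ones give some up. The relative error drives the adjustment.
        error = (latencies[fw] - mean_latency) / mean_latency
        updated[fw] = max(min_share, share * (1 + gain * error))
    # Renormalize so the shares again sum to 1.
    total = sum(updated.values())
    return {fw: s / total for fw, s in updated.items()}

# Two frameworks with equal shares, but "spark-a" is lagging behind:
new_shares = rebalance(
    {"spark-a": 0.5, "spark-b": 0.5},
    {"spark-a": 12.0, "spark-b": 8.0},
)
```

Invoked periodically inside a monitoring loop, successive small adjustments of this form let the allocation track time-varying workload intensities without knowing them in advance.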