A Resiliency-First Approach to Distributed DAG Computations
T.C. Leliveld (TU Delft - Electrical Engineering, Mathematics and Computer Science)
H. Peter Hofstee – Mentor
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
A framework is introduced for computations that apply transformations to immutable data. It takes inspiration from Apache Spark, but generalizes the model of computation from an emphasis on narrow and wide dependencies to an arbitrary set of transformations that form a directed acyclic graph (DAG). A distributed scheduling algorithm is developed, with resiliency mechanisms that can account for stopping failures. Furthermore, some properties of the system are derived. Finally, future work is discussed, showing that there is fertile ground for further research and development to extend this work.
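To make the computation model concrete, the following is a minimal single-process sketch of evaluating a DAG of transformations over immutable data. It is an illustration only, not the framework or the scheduling algorithm of this work: the function `run_dag`, the node names, and the use of Python's standard `graphlib` module are all assumptions introduced for the example, and distribution and failure handling are left out.

```python
from graphlib import TopologicalSorter


def run_dag(sources, transforms):
    """Evaluate a DAG of transformations (hypothetical helper).

    sources:    dict mapping node name -> immutable input value
    transforms: dict mapping node name -> (function, tuple of input node names)
    """
    # Map each transformation node to the set of nodes it depends on.
    deps = {name: set(inputs) for name, (_, inputs) in transforms.items()}
    results = dict(sources)
    # static_order() yields a node only after all of its dependencies;
    # it raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(deps).static_order():
        if name in results:
            continue  # source data needs no computation
        func, inputs = transforms[name]
        results[name] = func(*(results[i] for i in inputs))
    return results


# Illustrative graph: one source feeds two transformations,
# whose outputs are combined by a third.
sources = {"numbers": (1, 2, 3, 4)}
transforms = {
    "squares": (lambda xs: tuple(x * x for x in xs), ("numbers",)),
    "evens": (lambda xs: tuple(x for x in xs if x % 2 == 0), ("numbers",)),
    "total": (lambda a, b: sum(a) + sum(b), ("squares", "evens")),
}
out = run_dag(sources, transforms)
print(out["total"])  # 36
```

Because every node's inputs are immutable and each transformation is a function of those inputs alone, a lost intermediate result in this sketch could be rebuilt by rerunning only the transformations that lead to it. This is the general property that lineage-style recovery relies on; how the framework itself recovers from failures is described in the work, not here.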