- Ghit, B.I. (author). Data analytics frameworks enable users to process large datasets while hiding the complexity of scaling out their computations on large clusters of thousands of machines. Such frameworks parallelize the computations, distribute the data, and tolerate server failures by deploying their own runtime systems and distributed filesystems on subsets of... (doctoral thesis, 2017)
- Ghit, B.I. (author), Epema, D.H.J. (author). Providing fault tolerance is of major importance for data analytics frameworks such as Hadoop and Spark, which are typically deployed in large clusters that are known to experience high failure rates. Unexpected events such as compute node failures are in particular an important challenge for in-memory data analytics frameworks, as the widely... (conference paper, 2017)
- Ilyushkin, A.S. (author), Ali-Eldin, Ahmed (author), Herbst, Nikolas (author), Papadopoulos, Alessandro (author), Ghit, B.I. (author), Epema, D.H.J. (author), Iosup, A. (author). Simplifying the task of resource management and scheduling for customers, while still delivering complex Quality-of-Service (QoS), is key to cloud computing. Many autoscaling policies have been proposed in the past decade to decide on behalf of cloud customers when and how to provision resources to a cloud application utilizing cloud elasticity... (conference paper, 2017)
- Ghit, B.I. (author), Epema, D.H.J. (author). Many large-scale data analytics infrastructures are employed for a wide variety of jobs, ranging from short interactive queries to large data analysis jobs that may take hours or even days to complete. As a consequence, data-processing frameworks like MapReduce may have workloads consisting of jobs with heavy-tailed processing requirements. With... (conference paper, 2016)
- Ali-Eldin, Ahmed (author), Ilyushkin, A.S. (author), Ghit, B.I. (author), Herbst, Nikolas (author), Papadopoulos, Alessandro (author), Iosup, A. (author). Rapid elasticity is one of the essential characteristics of cloud computing identified by NIST. Elasticity allows resources to be provisioned and released to scale rapidly outward and inward according to demand. Tens, if not hundreds, of algorithms have been proposed in the literature to automatically achieve elastic provisioning. These... (conference paper, 2016)
- Ghit, B.I. (author), Epema, D.H.J. (author). A well-known problem when executing data-intensive workloads with frameworks such as MapReduce is that small jobs with processing requirements counted in minutes may suffer from the presence of huge jobs requiring hours or days of compute time, leading to a job slowdown distribution that is highly variable and that is uneven across jobs of... (conference paper, 2015)
- Ilyushkin, A.S. (author), Ghit, B.I. (author), Epema, D.H.J. (author). Workflows are important computational tools in many branches of science, and because of the dependencies among their tasks and their widely different characteristics, scheduling them is a difficult problem. Most research on scheduling workflows has focused on the offline problem of minimizing the makespan of single workflows with known task... (conference paper, 2015)
- Ghit, B.I. (author), Capota, M. (author), Hegeman, T.M. (author), Hidders, A.J.H. (author), Epema, D.H.J. (author), Iosup, A. (author). (conference paper, 2014)