Application-Oriented Scheduling in Multicluster Grids

Doctoral thesis (2010)

Authors

O.O. Sonmez

Contributors

H.J. Sips (promotor)

Department

Software Technology (TU Delft)

Performance, Scheduling, Grid computing, Scientific applications, Predictions

To reference this document use:

http://resolver.tudelft.nl/uuid:c2189367-acc2-4ca4-9713-5d1fb52e3720

Published Date

07-06-2010

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

ISBN:

9789079982073

Faculty

Electrical Engineering, Mathematics and Computer Science

Abstract

Grid computing appeared in the mid-1990s with the vision of sharing large, geographically dispersed computational resources for executing computation-intensive scientific applications. Today, numerous grid projects run successfully to solve challenging scientific problems. One example is the grid project of the European Organization for Nuclear Research (CERN), which combines thousands of computers worldwide (over 200 sites in about 30 countries) to store and analyze the huge amounts of data produced by the Large Hadron Collider (LHC) at CERN. The resources in a grid system are typically heterogeneous, since they belong to different administrative domains and are managed under proprietary policies. To cope with this heterogeneity, a grid relies on a layer of middleware, which offers transparent access to the distributed resources and simplifies the collaboration between organizations. Grids also need high-level scheduling systems that use grid middleware to map application tasks to resources and then manage their execution on behalf of users. However, scheduling in grids is challenging due to the dynamic nature of grid resources and the lack of control over those resources. The wide variety in the structural and communication characteristics of the applications submitted to grids further complicates grid scheduling, and may lead to poor or unpredictable performance unless these characteristics are taken into account. In this thesis, we address the challenge of designing and analyzing realistic and practical application-oriented scheduling mechanisms in multicluster grid systems. Application-oriented scheduling focuses on the optimization of user-centric performance criteria, such as application execution time, with methods that are specialized for different types of applications.
In this thesis we cover a wide range of grid application types, including parallel applications that may need co-allocation or malleability, bags-of-tasks that can benefit from cycle scavenging, and workflow applications that may push the system to its limits with their computation and data requirements. We investigate the performance of our scheduling mechanisms and policies in a real multicluster grid system, the DAS, using our KOALA multicluster grid scheduler, as well as in simulations using realistic scenarios.

Files

OzanThesis.pdf

(pdf | 4.74 MB)