Application-Oriented Scheduling in Multicluster Grids

Abstract

Grid computing emerged in the mid-1990s with the vision of sharing large, geographically dispersed computational resources for executing computation-intensive scientific applications. Today, numerous grid projects successfully solve challenging scientific problems. One example is the grid project of the European Organization for Nuclear Research (CERN), which combines thousands of computers worldwide (over 200 sites in about 30 countries) to store and analyze the huge amounts of data produced by the Large Hadron Collider (LHC) at CERN. The resources in a grid system are typically heterogeneous, since they belong to different administrative domains and are managed according to proprietary policies. To cope with this heterogeneity, a grid relies on a layer of middleware, which offers transparent access to the distributed resources and simplifies collaboration between organizations. Grids also need high-level scheduling systems that use the grid middleware to map application tasks to resources and then manage their execution on behalf of users. However, scheduling in grids is challenging due to the dynamic nature of grid resources and the lack of control over those resources. The wide variety in the structural and communication characteristics of the applications submitted to grids further complicates grid scheduling, and may lead to poor or unpredictable performance unless these characteristics are taken into account. In this thesis, we address the challenge of designing and analyzing realistic and practical application-oriented scheduling mechanisms in multicluster grid systems. Application-oriented scheduling focuses on optimizing user-centric performance criteria, such as application execution time, with methods specialized for different types of applications.
We cover a wide range of grid application types, including parallel applications that may need co-allocation or malleability, bags-of-tasks that can benefit from cycle scavenging, and workflow applications that may push the system to its limits with their computation and data requirements. We investigate the performance of our scheduling mechanisms and policies both in a real multicluster grid system, the DAS, using our KOALA multicluster grid scheduler, and in simulations using realistic scenarios.
