Authors
Steve J. Chapin, Dimitrios Katramatos, John Karpovich, Andrew Grimshaw
Abstract
The recent development of gigabit networking technology, combined with the proliferation of low-cost, high-performance microprocessors, has given rise to metacomputing environments. These environments can combine many thousands of hosts, from hundreds of administrative domains, connected by transnational and worldwide networks. Managing the resources in such a system is a complex task, but it is necessary in order to execute user programs efficiently and economically.
In this paper, we describe the resource management portions of the Legion metacomputing system, including the basic model and its implementation. These mechanisms are flexible both in their support for system-level resource management and in their adaptability to user-level scheduling policies. We show this by implementing a simple scheduling policy and demonstrating how it can be adapted to more complex algorithms.
Keywords
parallel and distributed systems, task scheduling, resource management, autonomy