
High Performance and Grid Computing
   
    LSF on HPC Cluster

    Platform Computing's Load Sharing Facility (LSF) is used for job queuing and scheduling on the HPC cluster compute nodes. All jobs run on the cluster compute nodes must be submitted to LSF using the bsub command.

    To help users manage their jobs and organize work plans, it may be helpful to understand the factors that affect how jobs are selected for scheduling. This information is current as of May 2004, but is subject to change in response to changing workload on the cluster.

    Job Priority
    Time of Submission

    All other factors being equal, jobs are scheduled on a first-in, first-out basis. However, the other factors are almost never equal, so time of submission turns out to be the least important factor in scheduling jobs.

    Queue Priority

    Each LSF queue is assigned a priority. Jobs in higher priority queues tend to be scheduled first. Queue priority tends to have the most effect when selecting between jobs belonging to users within the same project. Queue priority is shown by the bqueues command, which lists queues in descending priority order.

    Fair Share Priority

    Each project is assigned a base share. The project's share priority is reduced based on the number of processors currently being used by the project, the CPU time used recently by the project, and the total run time used recently by the project. Changes in the number of processors are reflected immediately in the project's share priority. That is, as soon as a job completes, the share priority increases to reflect fewer processors in use. CPU and run time adjustments decay away over time, currently on the order of a day. So a project that used a large amount of CPU time in the morning would continue to have a somewhat lower share priority throughout the afternoon.

    Share priority turns out to be the most important factor in determining which project has the highest priority for the next available job slot. Individual users within a project are selected round-robin. The bhpart command shows current share priorities and the bugroup command shows project membership of each individual user.
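
    The share priority adjustment described above can be illustrated with a small model. This is only a sketch: the page does not give the formula LSF uses on this cluster, so the functional form, the weighting factors (slot_factor, cpu_factor, run_factor), and the 12-hour half-life are all assumptions chosen to show the qualitative behavior, namely that processors in use lower priority immediately while recent CPU and run time lower it by an amount that fades over roughly a day.

```python
def decayed(amount, hours_ago, half_life_hours=12.0):
    """Exponentially decay past usage. The half-life is an assumed value."""
    return amount * 0.5 ** (hours_ago / half_life_hours)


def share_priority(base_share, procs_in_use, cpu_hours, run_hours, hours_ago,
                   slot_factor=1.0, cpu_factor=0.1, run_factor=0.1):
    """Illustrative share priority (not the cluster's actual formula).

    procs_in_use counts immediately; cpu_hours and run_hours were
    accumulated hours_ago hours in the past and are decayed first.
    All three factors are hypothetical weights.
    """
    recent_cpu = decayed(cpu_hours, hours_ago)
    recent_run = decayed(run_hours, hours_ago)
    penalty = (1.0
               + slot_factor * procs_in_use
               + cpu_factor * recent_cpu
               + run_factor * recent_run)
    return base_share / penalty


# A project that ran a large job in the morning: priority recovers
# as the usage ages, and jumps as soon as its processors are released.
busy_now = share_priority(1, procs_in_use=8, cpu_hours=64, run_hours=8, hours_ago=0)
afternoon = share_priority(1, procs_in_use=0, cpu_hours=64, run_hours=8, hours_ago=6)
next_day = share_priority(1, procs_in_use=0, cpu_hours=64, run_hours=8, hours_ago=24)
print(busy_now < afternoon < next_day)
```

    On the cluster itself, the bhpart command remains the authoritative source for current share priorities.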

    Base Share

    A project's base share is computed by dividing the total number of job slots on university compute nodes by the number of projects. Partner projects, that is, projects of faculty who have purchased cluster nodes, have the number of processors they purchased added to their base share. Because base shares must be specified as integer values, the computed base shares are normalized so that non-partner base shares are equal to one.
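
    As a worked example of the base share arithmetic, the sketch below uses made-up numbers (64 university job slots, four projects, two of them partners). The project names are hypothetical, and rounding to the nearest integer is an assumption; the page says only that shares are normalized so that non-partner shares equal one.

```python
def base_shares(university_slots, purchased_by_project):
    """Compute normalized integer base shares.

    purchased_by_project maps each project name to the number of
    processors it purchased (0 for non-partner projects).
    """
    per_project = university_slots / len(purchased_by_project)
    shares = {}
    for project, purchased in purchased_by_project.items():
        raw = per_project + purchased          # partners add their processors
        # Normalize so a non-partner project gets 1; rounding is an assumption.
        shares[project] = max(1, round(raw / per_project))
    return shares


example = base_shares(64, {
    "chem": 0,         # non-partner: 16 / 16 = 1
    "physics": 0,      # non-partner: 16 / 16 = 1
    "partner_a": 16,   # (16 + 16) / 16 = 2
    "partner_b": 32,   # (16 + 32) / 16 = 3
})
print(example)
```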

    Summary

    So in practice the factors tend to operate in the reverse of the order in which they were introduced. First, the highest priority project is identified based on share priority. Next, a user from that project is selected round-robin. If the user has more than one job waiting, a job is selected from the highest priority queue. If the user has more than one job waiting in that queue, the job with the earliest submission time is selected and assigned the available job slot.

    This is not an absolute process: jobs that wait a very long time can accumulate enough priority to overcome a low queue priority. In most cases, however, this is the observed LSF scheduling behavior.
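
    The typical selection order summarized above can be sketched as a small function. This is a simplified model of the observed behavior, not LSF's implementation: the Job class, the pick_next_job function, and the data layout are hypothetical, and the sketch ignores the long-wait exception just mentioned.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    queue_priority: int   # larger means a higher priority queue
    submit_time: float    # smaller means submitted earlier


def pick_next_job(project_priority, project_users, pending):
    """Pick the job that gets the next free slot, or None if nothing waits.

    project_priority: dict of project -> current share priority
    project_users:    dict of project -> deque of users in round-robin order
    pending:          list of waiting Job objects (the chosen job is removed)
    """
    waiting_users = {job.user for job in pending}
    candidates = [p for p, users in project_users.items()
                  if any(u in waiting_users for u in users)]
    if not candidates:
        return None
    # 1. Project with the highest share priority.
    project = max(candidates, key=lambda p: project_priority[p])
    # 2. Next user in that project, round-robin, skipping idle users.
    users = project_users[project]
    while users[0] not in waiting_users:
        users.rotate(-1)
    user = users[0]
    users.rotate(-1)  # this user goes to the back of the line
    # 3. Highest priority queue, then 4. earliest submission time.
    jobs = [j for j in pending if j.user == user]
    job = min(jobs, key=lambda j: (-j.queue_priority, j.submit_time))
    pending.remove(job)
    return job
```

    Calling pick_next_job repeatedly as slots free up, with project_priority refreshed between calls, reproduces the ordering described in the summary.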

Last modified: May 20 2004 16:22:36.
Copyright © 2003-2009 by NC State University and others, All Rights Reserved.

Site contact: Eric Sills, E-mail: eric_sills at ncsu dot edu , Tel: 919-513-0324, Fax: 919-513-1893, HPC and Grid Operations, Information Technology Division, Box 7109, North Carolina State University, Raleigh, NC27695-7914, USA