Queueing Policy

Daturamon queue policy:

In short, a share-tree policy is now implemented on datura, as decided at the NumRel group meeting. Users receive tickets for their jobs, and each job is weighted according to the number of tickets it receives. The number of tickets a job receives depends on how its owner's actual resource usage compares with that owner's target resource share, taken over all jobs currently queued. The ticket allocation is complex, but the following points should clarify the main elements of the policy:

  • At present, all users have an equal target resource; that is, all users are entitled to an equal share of the resources.
  • When a job runs or completes, the actual usage of resources for the corresponding user increases.
  • The system remembers actual resource usage with a half-life of 35 days.
  • If two jobs belonging to users with equal actual resource use are queued, the job requiring fewer resources should begin first, since it is asking for the smaller share of the total resources.
  • If two jobs requesting equal resources belong to users with equal actual resource use, the job submitted first is placed ahead in the queue.
  • Currently, the only resource counted is CPU time. Memory usage and I/O could be included in the future.
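
The half-life and the ordering rules above can be illustrated with a small Python sketch. This is not the Grid Engine ticket algorithm, which is considerably more involved; it only models the points listed. The function names, the job fields (owner_usage, slots, submit_time) and the assumption that a user with lower remembered usage is favoured are illustrative and not taken from the scheduler.

```python
HALF_LIFE_DAYS = 35.0  # half-life stated in the policy above


def decayed_usage(cpu_seconds, age_days, half_life_days=HALF_LIFE_DAYS):
    """Usage still remembered after age_days, halving every half_life_days."""
    return cpu_seconds * 0.5 ** (age_days / half_life_days)


def queue_order(jobs):
    """Simplified ordering (an assumption, not the real ticket computation):
    lower remembered owner usage first, then fewer requested slots,
    then earlier submission time."""
    return sorted(
        jobs,
        key=lambda j: (j["owner_usage"], j["slots"], j["submit_time"]),
    )


if __name__ == "__main__":
    # 1000 CPU-seconds used 35 days ago count as 500 today.
    print(decayed_usage(1000.0, 35.0))

    # Hypothetical jobs: names and numbers are made up for illustration.
    jobs = [
        {"name": "a", "owner_usage": 100.0, "slots": 64, "submit_time": 1},
        {"name": "b", "owner_usage": 100.0, "slots": 8, "submit_time": 2},
        {"name": "c", "owner_usage": 100.0, "slots": 8, "submit_time": 1},
    ]
    print([j["name"] for j in queue_order(jobs)])
```

With equal owner usage, the two 8-slot jobs come before the 64-slot job, and between the two 8-slot jobs the one submitted earlier comes first.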

The reservation flag should still be used (see Reservation and Backfilling below). In the future, one may assign users to projects and manage resource usage by project as well as by user. The queue will need time to adjust the accounting of actual resource usage, so please be patient with the new policy. We thank you in advance for your patience. For detailed information, you can read pages 130-146 in the Grid Engine administration manual (http://docs.sun.com/app/docs/doc/820-0698?a=load).

Reservation and Backfilling:
Jobs that require a large number of processors may have to wait a long time to run, because it is unlikely that enough processors will become free at the same time. To avoid this, processors can be reserved for a job by adding "-R y" to the qsub command. For example, the command "qsub -R y big_job.sh" submits big_job.sh with reservation. No other job can run on the reserved processors until the reserving job has finished, apart from short jobs that are backfilled as described below. It is not possible to reserve resources for a specific date and time in SGE.

SGE uses a technique known as backfilling to allow short jobs to use reserved resources (which are usually processors) while the reserving job is waiting to run. For a job to be considered for backfilling, the scheduler needs to know how long it will run, so the job should be submitted with a run-time limit using "-l h_rt". The time limit can be specified as a number of seconds, or in the form hh:mm:ss. For example, the command "qsub -l h_rt=02:30:00 good_job.sh" submits good_job.sh with a time limit of 2 hours and 30 minutes.
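
The two accepted time-limit formats are equivalent, which the following Python sketch makes concrete by converting either form to seconds and assembling the matching qsub argument list. The helper names h_rt_to_seconds and qsub_args are hypothetical; only the "-R y" and "-l h_rt=" options come from the text above, and the sketch builds the command without running it.

```python
def h_rt_to_seconds(value):
    """Convert an h_rt value given as seconds or as hh:mm:ss to seconds."""
    parts = value.split(":")
    if len(parts) == 1:
        return int(parts[0])
    if len(parts) == 3:
        hours, minutes, seconds = (int(p) for p in parts)
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError("expected seconds or hh:mm:ss, got %r" % value)


def qsub_args(script, h_rt=None, reserve=False):
    """Build (but do not execute) a qsub argument list for the given script."""
    args = ["qsub"]
    if reserve:
        args += ["-R", "y"]            # request a reservation
    if h_rt is not None:
        h_rt_to_seconds(h_rt)          # validate the format before use
        args += ["-l", "h_rt=" + h_rt] # declare the run-time limit
    args.append(script)
    return args


if __name__ == "__main__":
    print(h_rt_to_seconds("02:30:00"))                 # 9000
    print(qsub_args("good_job.sh", h_rt="02:30:00"))
    print(qsub_args("big_job.sh", reserve=True))
```

A limit of 02:30:00 is therefore the same as a limit of 9000 seconds.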