QUEUE_CONF(5)              Grid Engine File Formats              QUEUE_CONF(5)

NAME

       queue_conf - Grid Engine queue configuration file format

DESCRIPTION

       Queue_conf reflects the format of the template file for the cluster
       queue configuration. You can add cluster queues and modify the
       configuration of any queue in the cluster via the -aq and -mq options
       of the qconf(1) command.

       The queue_conf parameters take as values strings, integer decimal
       numbers or boolean, time and memory specifiers as well as comma
       separated lists. A time specifier either consists of a positive
       decimal, hexadecimal or octal integer constant, in which case the value
       is interpreted to be in seconds, or is built from 3 decimal integer
       numbers separated by colon signs, where the first number counts the
       hours, the second the minutes and the third the seconds. If a number
       would be zero it can be left out but the separating colon must remain
       (e.g. 1:0:1 = 1::1 means 1 hour and 1 second).

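       The time specifier rules above can be sketched in Python as follows.
       This is an illustration only, not Grid Engine's own parser: the
       function names are hypothetical, and the sketch assumes that
       hexadecimal and octal constants use the C-style prefixes "0x" and "0".

```python
def parse_int_constant(text):
    """Parse a decimal, hexadecimal (0x...) or octal (0...) integer constant.

    Assumption: C-style prefixes; the man page does not spell them out.
    """
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered[1:], 8)
    return int(lowered, 10)


def parse_time(spec):
    """Return the number of seconds described by a time specifier."""
    if ":" not in spec:
        # A plain integer constant is interpreted as seconds.
        return parse_int_constant(spec)
    fields = spec.split(":")
    if len(fields) != 3:
        raise ValueError("expected hours:minutes:seconds, got %r" % spec)
    # An omitted field counts as zero; the colons themselves must remain.
    hours, minutes, seconds = (int(f, 10) if f else 0 for f in fields)
    return hours * 3600 + minutes * 60 + seconds
```

       With this sketch, parse_time("1:0:1") and parse_time("1::1") both
       describe 1 hour and 1 second.
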
       Memory specifiers are positive decimal, hexadecimal or octal integer
       constants which may be followed by a multiplier letter. Valid
       multiplier letters are k, K, m, M, g and G, where k means multiply the
       value by 1000, K multiply by 1024, m multiply by 1000*1000, M multiply
       by 1024*1024, g multiply by 1000*1000*1000 and G multiply by
       1024*1024*1024. If no multiplier is present, the value is just counted
       in bytes.

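       A minimal Python sketch of the multiplier table above. The names
       MULTIPLIERS and parse_memory are hypothetical, and C-style "0x" and "0"
       prefixes are again an assumption for hexadecimal and octal constants.

```python
# Multiplier letters as listed above: lower case is decimal, upper case binary.
MULTIPLIERS = {
    "k": 1000, "K": 1024,
    "m": 1000 ** 2, "M": 1024 ** 2,
    "g": 1000 ** 3, "G": 1024 ** 3,
}


def parse_memory(spec):
    """Return the number of bytes described by a memory specifier."""
    factor = 1
    if spec and spec[-1] in MULTIPLIERS:
        factor = MULTIPLIERS[spec[-1]]
        spec = spec[:-1]
    lowered = spec.lower()
    if lowered.startswith("0x"):          # assumed hexadecimal prefix
        value = int(lowered[2:], 16)
    elif len(lowered) > 1 and lowered.startswith("0"):  # assumed octal prefix
        value = int(lowered[1:], 8)
    else:
        value = int(lowered, 10)
    return value * factor
```

       For example, parse_memory("1k") yields 1000 while parse_memory("1K")
       yields 1024.
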
       If more than one host is specified under 'hostlist' (by means of a list
       of hosts or with host groups) it can be desirable to have user
       specified overrides from the setting used for each host. These
       overrides can be expressed using the enhanced queue_conf specifier
       syntax. This syntax builds upon the regular parameter specifier syntax
       as described below under 'FORMAT' separately for each parameter and in
       the paragraph above:

       "["host_identifier=<parameters_specifier_syntax>"]"
       [,"["host_identifier=<parameters_specifier_syntax>"]" ]

       Even in the enhanced queue_conf specifier syntax an entry

       <current_attribute_syntax>

       without brackets denoting the default setting is required and used for
       all queue instances for which no user specified override is given.
       Tuples with a host group host_identifier override the default setting.
       Tuples with a host name host_identifier override both the default and
       the host group setting. Note also that a default setting is always
       needed for all configuration attributes with the enhanced queue_conf
       specifier syntax.

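       The precedence described above (host name over host group over
       default) can be sketched as follows. This is a simplified illustration
       under stated assumptions: parse_enhanced and resolve are hypothetical
       names, values are treated as opaque strings that contain no brackets,
       and the attribute value "2,[@linux=4],[hostA=8]" used below is an
       invented example.

```python
import re

# One bracketed tuple of the enhanced syntax: [host_identifier=value]
TUPLE_RE = re.compile(r"\[([^=\[\]]+)=([^\[\]]*)\]")


def parse_enhanced(value):
    """Split an attribute value into (default, {host_identifier: override})."""
    overrides = {}
    for identifier, setting in TUPLE_RE.findall(value):
        if identifier in overrides:
            raise ValueError("more than one setting for %s" % identifier)
        overrides[identifier] = setting
    # Whatever is left outside the brackets is the mandatory default.
    default = TUPLE_RE.sub("", value)
    default = re.sub(r",(\s*,)+", ",", default).strip(", ")
    if not default:
        raise ValueError("configuration without default setting")
    return default, overrides


def resolve(default, overrides, host, host_groups):
    """Pick the setting for one queue instance: host, then group, then default."""
    if host in overrides:
        return overrides[host]
    matching = [group for group in host_groups if group in overrides]
    if len(matching) > 1:
        # Overlapping host groups: the setting is ambiguous for this host.
        raise ValueError("ambiguous setting for %s: %s" % (host, matching))
    if matching:
        return overrides[matching[0]]
    return default
```

       Given default, overrides = parse_enhanced("2,[@linux=4],[hostA=8]"),
       hostA resolves to "8", any other member of @linux to "4", and every
       remaining host to the default "2".
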
       Integrity verifications will be applied on the configuration.

       Configurations without default setting are rejected. Ambiguous
       configurations with more than one attribute setting for a particular
       host are rejected. Configurations containing override values for hosts
       not listed under 'hostlist' are accepted but are indicated by -sds of
       qconf(1). The cluster queue should contain an unambiguous specification
       for each configuration attribute of each queue instance specified under
       hostlist in queue_conf. Ambiguous configurations with more than one
       attribute setting resulting from overlapping host groups are indicated
       by -explain c of qstat(1) and cause the queue instance with ambiguous
       configurations to enter the c(onfiguration ambiguous) state.

FORMAT

       The following list of queue_conf parameters specifies the queue_conf
       content:

   qname
       The name of the cluster queue (type string; template default:
       template).

   hostlist
       A list of host names (fully-qualified) and host group names. Host group
       names must begin with an "@" sign. If multiple hosts are specified the
       queue_conf constitutes multiple queue instances. Each host may be
       specified only once in this list (type string; template default: NONE).

   seq_no
       This parameter, in conjunction with the host's load, specifies this
       queue's position in the scheduling order within the suitable queues for
       a job to be dispatched under consideration of the queue_sort_method
       (see sched_conf(5)).

       Regardless of the queue_sort_method setting, qstat(1) reports queue
       information in the order defined by the value of the seq_no. Set this
       parameter to a monotonically increasing sequence (type number; template
       default: 0).

   load_thresholds
       load_thresholds is a list of load thresholds. If one of the thresholds
       is exceeded no further jobs will be scheduled to the queues and qmon(1)
       will signal an overload condition for this node. Arbitrary load values
       defined in the "host" and "global" complexes (see complex(5) for
       details) can be used.

       The syntax is that of a comma separated list with each list element
       consisting of the name of a load value, an equal sign and the threshold
       value intended to trigger the overload situation (e.g.
       load_avg=1.75,users_logged_in=5).

       Note: Load values as well as consumable resources may be scaled
       differently for different hosts if specified in the corresponding
       execution host definitions (refer to host_conf(5) for more
       information). Load thresholds are compared against the scaled load and
       consumable values.

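       The list syntax of load_thresholds can be read with a few lines of
       Python. This sketch makes two simplifying assumptions that the text
       above does not state: threshold values are plain numbers, and a
       threshold counts as exceeded when the (already scaled) load value is
       larger than it. The function names are hypothetical.

```python
def parse_thresholds(spec):
    """Turn 'name=value,name=value' into a {name: float} mapping."""
    thresholds = {}
    for element in spec.split(","):
        name, _, value = element.partition("=")
        thresholds[name.strip()] = float(value)
    return thresholds


def exceeded(thresholds, load):
    """Return the names of all thresholds exceeded by the reported load.

    Assumption: a larger load value is worse; real comparisons may depend
    on the definition of the load value in complex(5).
    """
    return sorted(
        name for name, limit in thresholds.items()
        if name in load and load[name] > limit
    )
```

       With the example from the text, a host reporting load_avg=2.0 and
       users_logged_in=3 exceeds only the load_avg threshold.
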
   suspend_thresholds
       A list of load thresholds with the same semantics as that of the
       load_thresholds parameter (see above) except that exceeding one of the
       denoted thresholds initiates suspension of one or more jobs in the
       queue. See the nsuspend parameter below for details on the number of
       jobs which are suspended.

   nsuspend
       The number of jobs which are suspended/enabled per time interval if at
       least one of the load thresholds in the suspend_thresholds list is
       exceeded or if no suspend_threshold is violated anymore, respectively.
       Nsuspend jobs are suspended in each time interval until no
       suspend_thresholds are exceeded anymore or all jobs in the queue are
       suspended. Jobs are enabled in the corresponding way if the
       suspend_thresholds are no longer exceeded. The time interval in which
       the suspensions of the jobs occur is defined in suspend_interval below.

   suspend_interval
       The time interval in which further nsuspend jobs are suspended if one
       of the suspend_thresholds (see above for both) is exceeded by the
       current load on the host on which the queue is located. The time
       interval is also used when enabling the jobs.

   priority
       The priority parameter specifies the nice(2) value at which jobs in
       this queue will be run. The type is number and the default is zero
       (which means no nice value is set explicitly). Negative values (up to
       -20) correspond to a higher scheduling priority, positive values (up to
       +20) correspond to a lower scheduling priority.

       Note: the value of priority has no effect if Grid Engine adjusts
       priorities dynamically to implement ticket-based entitlement policy
       goals. Dynamic priority adjustment is switched off by default due to
       sge_conf(5) reprioritize being set to false.

   min_cpu_interval
       The time between two automatic checkpoints in case of transparently
       checkpointing jobs. The maximum of the time requested by the user via
       qsub(1) and the time defined by the queue configuration is used as
       checkpoint interval. Since checkpoint files may be considerably large
       and thus writing them to the file system may become expensive, users
       and administrators are advised to choose sufficiently large time
       intervals. min_cpu_interval is of type time and the default is 5
       minutes (which usually is suitable for test purposes only).

   processors
       A set of processors in case of a multiprocessor execution host can be
       defined to which the jobs executing in this queue are bound. The value
       type of this parameter is a range description like that of the -pe
       option of qsub(1) (e.g. 1-4,8,10) denoting the processor numbers for
       the processor group to be used. Obviously the interpretation of these
       values relies on operating system specifics and is thus performed
       inside sge_execd(8) running on the queue host. Therefore, the parsing
       of the parameter has to be provided by the execution daemon and the
       parameter is only passed through sge_qmaster(8) as a string.

       Currently, support is only provided for SGI multiprocessor machines
       running IRIX 6.2 and Digital UNIX multiprocessor machines. In the case
       of Digital UNIX only one job per processor set is allowed to execute at
       the same time, i.e. slots (see below) should be set to 1 for this
       queue.

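       To illustrate the range description used by processors, the following
       Python sketch expands a value such as 1-4,8,10 into individual
       processor numbers. It only illustrates the notation; the actual
       interpretation happens inside sge_execd(8) and is operating system
       specific. The function name is hypothetical.

```python
def parse_processor_range(spec):
    """Expand a range description such as '1-4,8,10' into a list of numbers."""
    processors = []
    for part in spec.split(","):
        first, dash, last = part.partition("-")
        if dash:
            # 'first-last' denotes an inclusive range.
            processors.extend(range(int(first), int(last) + 1))
        else:
            processors.append(int(first))
    return processors
```
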
   qtype
       The type of queue. Currently batch, interactive or a combination in a
       comma separated list or NONE.

       The formerly supported types parallel and checkpointing are not allowed
       anymore. A queue instance is implicitly of type parallel/checkpointing
       if there is a parallel environment or a checkpointing interface
       specified for this queue instance in pe_list/ckpt_list. Formerly
       possible settings, e.g.

       qtype   PARALLEL

       could be transferred into

       qtype   NONE
       pe_list pe_name

       (type string; default: batch interactive).

   pe_list
       The list of administrator-defined parallel environments to be
       associated with the queue. The default is NONE.

   ckpt_list
       The list of administrator-defined checkpointing interfaces to be
       associated with the queue. The default is NONE.

   rerun
       Defines a default behavior for jobs which are aborted by system crashes
       or manual "violent" (via kill(1)) shutdown of the complete Grid Engine
       system (including the sge_shepherd(8) of the jobs and their process
       hierarchy) on the queue host. As soon as sge_execd(8) is restarted and
       detects that a job has been aborted for such reasons it can be
       restarted if the job is restartable. A job may not be restartable, for
       example, if it updates databases (first reads then writes to the same
       record of a database/file) because aborting the job may have left the
       database in an inconsistent state. If the owner of a job wants to
       overrule the default behavior for the jobs in the queue the -r option
       of qsub(1) can be used.

       The type of this parameter is boolean, thus either TRUE or FALSE can be
       specified. The default is FALSE, i.e. do not restart jobs
       automatically.

   slots
       The maximum number of concurrently executing jobs allowed in the queue.
       Type is number.

   tmpdir
       The tmpdir parameter specifies the absolute path to the base of the
       temporary directory filesystem. When sge_execd(8) launches a job, it
       creates a uniquely-named directory in this filesystem for the purpose
       of holding scratch files during job execution. At job completion, this
       directory and its contents are removed automatically. The environment
       variables TMPDIR and TMP are set to the path of each job's scratch
       directory (type string; default: /tmp).

   shell
       If either posix_compliant or script_from_stdin is specified as the
       shell_start_mode parameter in sge_conf(5) the shell parameter specifies
       the executable path of the command interpreter (e.g. sh(1) or csh(1))
       to be used to process the job scripts executed in the queue. The
       definition of shell can be overruled by the job owner via the qsub(1)
       -S option.

       The type of the parameter is string. The default is /bin/csh.

   shell_start_mode
       This parameter defines the mechanisms which are used to actually invoke
       the job scripts on the execution hosts. The following values are
       recognized:

       unix_behavior
              If a user starts a job shell script under UNIX interactively by
              invoking it just with the script name the operating system's
              executable loader uses the information provided in a comment
              such as `#!/bin/csh' in the first line of the script to detect
              which command interpreter to start to interpret the script. This
              mechanism is used by Grid Engine when starting jobs if
              unix_behavior is defined as shell_start_mode.

       posix_compliant
              POSIX does not consider first script line comments such as
              `#!/bin/csh' as being significant. The POSIX standard for batch
              queuing systems (P1003.2d) therefore requires a compliant
              queuing system to ignore such lines but to use user specified or
              configured default command interpreters instead. Thus, if
              shell_start_mode is set to posix_compliant Grid Engine will
              either use the command interpreter indicated by the -S option of
              the qsub(1) command or the shell parameter of the queue to be
              used (see above).

       script_from_stdin
              Setting the shell_start_mode parameter either to posix_compliant
              or unix_behavior requires you to set the umask in use for
              sge_execd(8) such that every user has read access to the
              active_jobs directory in the spool directory of the
              corresponding execution daemon. In case you have prolog and
              epilog scripts configured, they also need to be readable by any
              user who may execute jobs.
              If this violates your site's security policies you may want to
              set shell_start_mode to script_from_stdin. This will force Grid
              Engine to open the job script as well as the epilogue and
              prologue scripts for reading into STDIN as root (if sge_execd(8)
              was started as root) before changing to the job owner's user
              account. The script is then fed into the STDIN stream of the
              command interpreter indicated by the -S option of the qsub(1)
              command or the shell parameter of the queue to be used (see
              above).
              Thus setting shell_start_mode to script_from_stdin also implies
              posix_compliant behavior. Note, however, that feeding scripts
              into the STDIN stream of a command interpreter may cause trouble
              if commands like rsh(1) are invoked inside a job script as they
              also process the STDIN stream of the command interpreter. These
              problems can usually be resolved by redirecting the STDIN
              channel of those commands to come from /dev/null (e.g. rsh host
              date < /dev/null). Note also that any command-line options
              associated with the job are passed to the executing shell. The
              shell will only forward them to the job if they are not
              recognized as valid shell options.

       The default for shell_start_mode is posix_compliant. Note, though,
       that the shell_start_mode can only be used for batch jobs submitted by
       qsub and cannot be used for interactive jobs submitted by qrsh, qsh or
       qlogin.

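       The way the three shell_start_mode values select a command interpreter
       can be summarized in a short Python sketch. The function and parameter
       names are hypothetical, and returning None for a script without a `#!'
       line under unix_behavior is an assumption of this sketch (the text
       above leaves that case to the operating system).

```python
def choose_interpreter(shell_start_mode, script_first_line,
                       queue_shell, qsub_shell=None):
    """Return the interpreter path implied by the shell_start_mode rules.

    queue_shell is the queue's shell parameter; qsub_shell is the value of
    the qsub -S option, if the job owner gave one.
    """
    if shell_start_mode == "unix_behavior":
        # The '#!' comment in the first script line names the interpreter.
        if script_first_line.startswith("#!"):
            parts = script_first_line[2:].split()
            return parts[0] if parts else None
        return None  # assumption: left to the operating system
    if shell_start_mode in ("posix_compliant", "script_from_stdin"):
        # The '#!' line is ignored; -S overrules the queue's shell parameter.
        return qsub_shell or queue_shell
    raise ValueError("unknown shell_start_mode: %r" % shell_start_mode)
```
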
   prolog
       The executable path of a shell script that is started before execution
       of Grid Engine jobs with the same environment setting as that for the
       Grid Engine jobs to be started afterwards. An optional prefix "user@"
       specifies the user under which this procedure is to be started. The
       procedure's standard output and error output streams are written to
       the same file that is used for the standard output and error output of
       each job. This procedure is intended as a means for the Grid Engine
       administrator to automate the execution of general site specific tasks
       like the preparation of temporary file systems with the need for the
       same context information as the job. This queue configuration entry
       overwrites cluster global or execution host specific prolog definitions
       (see sge_conf(5)).

       The default for prolog is the special value NONE, which prevents
       execution of a prologue script. The special variables for constituting
       a command line are the same as in prolog definitions of the cluster
       configuration (see sge_conf(5)).

   epilog
       The executable path of a shell script that is started after execution
       of Grid Engine jobs with the same environment setting as that for the
       Grid Engine jobs that have just completed. An optional prefix "user@"
       specifies the user under which this procedure is to be started. The
       procedure's standard output and error output streams are written to
       the same file that is used for the standard output and error output of
       each job. This procedure is intended as a means for the Grid Engine
       administrator to automate the execution of general site specific tasks
       like the cleaning up of temporary file systems with the need for the
       same context information as the job. This queue configuration entry
       overwrites cluster global or execution host specific epilog definitions
       (see sge_conf(5)).

       The default for epilog is the special value NONE, which prevents
       execution of an epilogue script. The special variables for constituting
       a command line are the same as in prolog definitions of the cluster
       configuration (see sge_conf(5)).

   starter_method
       The specified executable path will be used as a job starter facility
       responsible for starting batch jobs. The executable path will be
       executed instead of the configured shell to start the job. The job
       arguments will be passed as arguments to the job starter. The following
       environment variables are used to pass information to the job starter
       concerning the shell environment which was configured or requested to
       start the job.

       SGE_STARTER_SHELL_PATH
              The name of the requested shell to start the job

       SGE_STARTER_SHELL_START_MODE
              The configured shell_start_mode

       SGE_STARTER_USE_LOGIN_SHELL
              Set to "true" if the shell is supposed to be used as a login
              shell (see login_shells in sge_conf(5))

       The starter_method will not be invoked for qsh, qlogin or qrsh acting
       as rlogin.

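       As an illustration of how a job starter might read the three
       environment variables above, consider the following Python sketch. The
       function plan_start and its policy of handing the job arguments to the
       requested shell are assumptions made for this example; a real starter
       is free to start the job in any way the site requires.

```python
def plan_start(job_args, environ):
    """Collect the starter environment and build a command line for the job.

    environ is a mapping such as os.environ. The policy below (run the job
    arguments through the requested shell, if one is given) is hypothetical.
    """
    info = {
        "shell": environ.get("SGE_STARTER_SHELL_PATH"),
        "start_mode": environ.get("SGE_STARTER_SHELL_START_MODE"),
        "login_shell": environ.get("SGE_STARTER_USE_LOGIN_SHELL") == "true",
    }
    argv = ([info["shell"]] if info["shell"] else []) + list(job_args)
    return info, argv
```

       A starter script built on this sketch would typically finish by
       replacing itself with the planned command, for example through
       os.execv(argv[0], argv).
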
   suspend_method
   resume_method
   terminate_method
       These parameters can be used for overwriting the default method used by
       Grid Engine for suspension, release of a suspension and for termination
       of a job. By default, the signals SIGSTOP, SIGCONT and SIGKILL are
       delivered to the job to perform these actions. However, for some
       applications this is not appropriate.

       If no executable path is given, Grid Engine takes the specified
       parameter entries as the signal to be delivered instead of the default
       signal. A signal must be either a positive number or a signal name with
       "SIG" as prefix and the signal name as printed by kill -l (e.g.
       SIGTERM).

       If an executable path is given (it must be an absolute path starting
       with a "/") then this command together with its arguments is started by
       Grid Engine to perform the appropriate action. The following special
       variables are expanded at runtime and can be used (besides any other
       strings which have to be interpreted by the procedures) to constitute a
       command line:

       $host  The name of the host on which the procedure is started.

       $job_owner
              The user name of the job owner.

       $job_id
              Grid Engine's unique job identification number.

       $job_name
              The name of the job.

       $queue The name of the queue.

       $job_pid
              The pid of the job.

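       The distinction between a signal entry and a command entry, and the
       expansion of the special variables, can be sketched as follows. The
       function name interpret_method, the path /usr/local/bin/stop_job and
       all context values are hypothetical; Grid Engine performs this
       expansion itself at runtime.

```python
import re

VARIABLES = ("host", "job_owner", "job_id", "job_name", "queue", "job_pid")
VAR_RE = re.compile(r"\$(%s)\b" % "|".join(VARIABLES))


def interpret_method(entry, context):
    """Classify a *_method entry and expand its special variables.

    Returns ("command", expanded_line) for an absolute path, otherwise
    ("signal", entry) for a signal number or a SIG-prefixed name.
    """
    if entry.startswith("/"):
        expanded = VAR_RE.sub(lambda m: str(context[m.group(1)]), entry)
        return ("command", expanded)
    return ("signal", entry)
```

       For example, the entry "/usr/local/bin/stop_job $job_pid $queue" for a
       job with pid 1234 in queue all.q expands to
       "/usr/local/bin/stop_job 1234 all.q", while "SIGTERM" is taken as the
       signal to deliver.
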
   notify
       The time to wait between delivery of SIGUSR1/SIGUSR2 notification
       signals and suspend/kill signals if the job was submitted with the
       qsub(1) -notify option.

   owner_list
       The owner_list names the login names (in a comma separated list) of
       those users who are authorized to disable and suspend this queue
       through qmod(1) (Grid Engine operators and managers can do this by
       default). It is customary to set this field for queues on interactive
       workstations where the computing resources are shared between
       interactive sessions and Grid Engine jobs, allowing the workstation
       owner to have priority access (type string; default: NONE).

   user_lists
       The user_lists parameter contains a comma separated list of so called
       user access lists as described in access_list(5). Each user contained
       in at least one of the listed access lists has access to the queue. If
       the user_lists parameter is set to NONE (the default) any user who is
       not explicitly excluded via the xuser_lists parameter described below
       has access. If a user is contained both in an access list listed in
       xuser_lists and user_lists the user is denied access to the queue.

   xuser_lists
       The xuser_lists parameter contains a comma separated list of so called
       user access lists as described in access_list(5). Each user contained
       in at least one of the listed access lists is not allowed to access the
       queue. If the xuser_lists parameter is set to NONE (the default) any
       user has access. If a user is contained both in an access list listed
       in xuser_lists and user_lists the user is denied access to the queue.

   projects
       The projects parameter contains a comma separated list of projects that
       have access to the queue. Any projects not in this list are denied
       access to the queue. If set to NONE (the default), any project has
       access that is not specifically excluded via the xprojects parameter
       described below. If a project is in both the projects and xprojects
       parameters, the project is denied access to the queue.

   xprojects
       The xprojects parameter contains a comma separated list of projects
       that are denied access to the queue. If set to NONE (the default), no
       projects are denied access other than those denied access based on the
       projects parameter described above. If a project is in both the
       projects and xprojects parameters, the project is denied access to the
       queue.

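       The access rules for user_lists/xuser_lists and projects/xprojects
       follow the same pattern, which the following Python sketch captures.
       The function name has_access is hypothetical, and the value NONE is
       represented here by an empty list.

```python
def has_access(member_of, allow_lists, deny_lists):
    """Decide queue access from allow lists and deny lists.

    member_of: access lists the user belongs to (or the job's project).
    allow_lists: user_lists or projects; an empty list stands for NONE.
    deny_lists: xuser_lists or xprojects; an empty list stands for NONE.
    """
    member_of = set(member_of)
    if member_of & set(deny_lists):
        # Exclusion wins, even if the same user or project is also allowed.
        return False
    if not allow_lists:
        return True
    return bool(member_of & set(allow_lists))
```
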
   subordinate_list
       A list of Grid Engine cluster queues. Subordinate relationships are in
       effect only between queue instances residing at the same host. If there
       is a queue instance (be it the sub- or superordinated one) on only one
       particular host this relationship is ignored. Queue instances residing
       on the same host will be suspended when a specified count of jobs is
       running in this queue instance. The list specification is the same as
       that of the load_thresholds parameter above, e.g. low_pri_q=5,small_q.
       The numbers denote the job slots of the queue that have to be filled to
       trigger the suspension of the subordinated queue. If no value is
       assigned a suspension is triggered if all slots of the queue are
       filled.

       On nodes which host more than one queue, you might wish to accord
       better service to certain classes of jobs (e.g., queues that are
       dedicated to parallel processing might need priority over low priority
       production queues; default: NONE).

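       A short Python sketch of the subordinate_list syntax and its trigger
       rule, using the example low_pri_q=5,small_q from the text. The
       function names are hypothetical, and the sketch assumes that "filled"
       means the number of used slots has reached the given count.

```python
def parse_subordinates(spec):
    """Map each subordinated queue to its slot count (None if not given)."""
    subordinates = {}
    for element in spec.split(","):
        name, equals, count = element.partition("=")
        subordinates[name.strip()] = int(count) if equals else None
    return subordinates


def triggers_suspension(threshold, used_slots, total_slots):
    """True if the superordinated queue instance suspends the subordinate.

    Without an explicit count, all slots of the queue must be filled.
    """
    needed = total_slots if threshold is None else threshold
    return used_slots >= needed
```
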
   complex_values
       complex_values defines quotas for resource attributes managed via this
       queue. The syntax is the same as for load_thresholds (see above). The
       quotas are related to the resource consumption of all jobs in a queue
       in the case of consumable resources (see complex(5) for details on
       consumable resources) or they are interpreted on a per queue slot (see
       slots above) basis in the case of non-consumable resources. Consumable
       resource attributes are commonly used to manage free memory, free disk
       space or available floating software licenses while non-consumable
       attributes usually define distinctive characteristics like type of
       hardware installed.

       For consumable resource attributes an available resource amount is
       determined by subtracting the current resource consumption of all
       running jobs in the queue from the quota in the complex_values list.
       Jobs can only be dispatched to a queue if no resource requests exceed
       any corresponding resource availability obtained by this scheme. The
       quota definition in the complex_values list is automatically replaced
       by the current load value reported for this attribute, if load is
       monitored for this resource and if the reported load value is more
       stringent than the quota. This effectively avoids oversubscription of
       resources.

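       The availability calculation for a consumable resource reduces to a
       subtraction, sketched below in Python. The function names are
       hypothetical, and the sketch deliberately leaves out the replacement of
       the quota by a more stringent load value.

```python
def available(quota, running_consumption):
    """Quota minus the consumption of all jobs running in the queue."""
    return quota - sum(running_consumption)


def can_dispatch(request, quota, running_consumption):
    """A job fits only if its request does not exceed the availability."""
    return request <= available(quota, running_consumption)
```

       With a quota of 8 and two running jobs consuming 2 and 3 units, 3 units
       remain, so a request for 3 units can be dispatched while a request for
       4 units cannot.
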
       Note: Load values replacing the quota specifications may have become
       more stringent because they have been scaled (see host_conf(5)) and/or
       load adjusted (see sched_conf(5)). The -F option of qstat(1) and the
       load display in the qmon(1) queue control dialog (activated by clicking
       on a queue icon while the "Shift" key is pressed) provide detailed
       information on the actual availability of consumable resources and on
       the origin of the values taken into account currently.

       Note also: The resource consumption of running jobs (used for the
       availability calculation) as well as the resource requests of the jobs
       waiting to be dispatched may be derived either from explicit user
       requests during job submission (see the -l option to qsub(1)) or from a
       "default" value configured for an attribute by the administrator (see
       complex(5)). The -r option to qstat(1) can be used for retrieving full
       detail on the actual resource requests of all jobs in the system.

       For non-consumable resources Grid Engine simply compares the job's
       attribute requests with the corresponding specification in
       complex_values taking the relation operator of the complex attribute
       definition into account (see complex(5)). If the result of the
       comparison is "true", the queue is suitable for the job with respect to
       the particular attribute. For parallel jobs each queue slot to be
       occupied by a parallel task is meant to provide the same resource
       attribute value.

       Note: Only numeric complex attributes can be defined as consumable
       resources and hence non-numeric attributes are always handled on a per
       queue slot basis.

       The default value for this parameter is NONE, i.e. no administrator
       defined resource attribute quotas are associated with the queue.

   calendar
       Specifies the calendar to be valid for this queue or contains NONE (the
       default). A calendar defines the availability of a queue depending on
       time of day, week and year. Please refer to calendar_conf(5) for
       details on the Grid Engine calendar facility.

       Note: Jobs can request queues with a certain calendar model via a "-l
       c=<cal_name>" option to qsub(1).

   initial_state
       Defines an initial state for the queue either when adding the queue to
       the system for the first time or on start-up of the sge_execd(8) on the
       host on which the queue resides. Possible values are:

       default   The queue is enabled when adding the queue or is reset to the
                 previous status when sge_execd(8) comes up (this corresponds
                 to the behavior in earlier Grid Engine releases not
                 supporting initial_state).

       enabled   The queue is enabled in either case. This is equivalent to a
                 manual and explicit 'qmod -e' command (see qmod(1)).

       disabled  The queue is disabled in either case. This is equivalent to a
                 manual and explicit 'qmod -d' command (see qmod(1)).

RESOURCE LIMITS

       The first two resource limit parameters, s_rt and h_rt, are implemented
       by Grid Engine. They define the "real time", also called "elapsed" or
       "wall clock" time, which has passed since the start of the job. If h_rt
       is exceeded by a job running in the queue, it is aborted via the
       SIGKILL signal (see kill(1)). If s_rt is exceeded, the job is first
       "warned" via the SIGUSR1 signal (which can be caught by the job) and
       finally aborted after the notification time defined in the queue
       configuration parameter notify (see above) has passed.

       The resource limit parameters s_cpu and h_cpu are implemented by Grid
       Engine as job limits. They impose a limit on the amount of combined CPU
       time consumed by all the processes in the job. If h_cpu is exceeded by
       a job running in the queue, it is aborted via a SIGKILL signal (see
       kill(1)). If s_cpu is exceeded, the job is sent a SIGXCPU signal which
       can be caught by the job. If you wish to allow a job to be "warned" so
       it can exit gracefully before it is killed then you should set the
       s_cpu limit to a lower value than h_cpu. For parallel processes, the
       limit is applied per slot which means that the limit is multiplied by
       the number of slots being used by the job before being applied.

       The resource limit parameters s_vmem and h_vmem are implemented by Grid
       Engine as job limits. They impose a limit on the amount of combined
       virtual memory consumed by all the processes in the job. If h_vmem is
       exceeded by a job running in the queue, it is aborted via a SIGKILL
       signal (see kill(1)). If s_vmem is exceeded, the job is sent a SIGXCPU
       signal which can be caught by the job. If you wish to allow a job to
       be "warned" so it can exit gracefully before it is killed then you
       should set the s_vmem limit to a lower value than h_vmem. For parallel
       processes, the limit is applied per slot which means that the limit is
       multiplied by the number of slots being used by the job before being
       applied.

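       A job that wants to use the "warning" delivered by the soft limits has
       to catch the corresponding signals. The following Python sketch shows
       one way a job written in Python might do this on a UNIX system. The
       names warnings_seen, on_soft_limit and install_handlers are
       hypothetical, and what the job does after the warning (for example
       writing a final result file) is left to the application.

```python
import signal

# Names of the soft limit signals received so far.
warnings_seen = []


def on_soft_limit(signum, frame):
    """Record the warning; the main loop can then shut down gracefully."""
    warnings_seen.append(signal.Signals(signum).name)


def install_handlers():
    """Catch SIGUSR1 (s_rt) and SIGXCPU (s_cpu, s_vmem) instead of dying."""
    signal.signal(signal.SIGUSR1, on_soft_limit)
    signal.signal(signal.SIGXCPU, on_soft_limit)


if __name__ == "__main__":
    install_handlers()
```

       The job's main loop would check warnings_seen at convenient points and
       finish its work before the hard limit, or the notify time in the case
       of s_rt, leads to SIGKILL.
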
       The remaining parameters in the queue configuration template specify
       per job soft and hard resource limits as implemented by the
       setrlimit(2) system call. See this manual page on your system for more
       information. By default, each limit field is set to infinity (which
       means RLIM_INFINITY as described in the setrlimit(2) manual page). The
       value type for the CPU-time limits s_cpu and h_cpu is time. The value
       type for the other limits is memory. Note: Not all systems support
       setrlimit(2).

       Note also: s_vmem and h_vmem (virtual memory) are only available on
       systems supporting RLIMIT_VMEM (see setrlimit(2) on your operating
       system).

       The UNICOS operating system supplied by SGI/Cray does not support the
       setrlimit(2) system call, using their own resource limit-setting system
       call instead. For UNICOS systems only, the following meanings apply:

       s_cpu     The per-process CPU time limit in seconds.

       s_core    The per-process maximum core file size in bytes.

       s_data    The per-process maximum memory limit in bytes.

       s_vmem    The same as s_data (if both are set the minimum is used).

       h_cpu     The per-job CPU time limit in seconds.

       h_data    The per-job maximum memory limit in bytes.

       h_vmem    The same as h_data (if both are set the minimum is used).

       h_fsize   The total number of disk blocks that this job can create.

SEE ALSO

       sge_intro(1), csh(1), qconf(1), qmon(1), qrestart(1), qstat(1),
       qsub(1), sh(1), nice(2), setrlimit(2), access_list(5),
       calendar_conf(5), sge_conf(5), complex(5), host_conf(5),
       sched_conf(5), sge_execd(8), sge_qmaster(8), sge_shepherd(8).

COPYRIGHT

       See sge_intro(1) for a full statement of rights and permissions.

GE 6.1                   $Date: 2007/07/19 08:17:17 $            QUEUE_CONF(5)