sacctmgr(1)                     Slurm Commands                    sacctmgr(1)


NAME
       sacctmgr - Used to view and modify Slurm account information.


SYNOPSIS
       sacctmgr [OPTIONS...] [COMMAND...]


DESCRIPTION
       sacctmgr is used to view or modify Slurm account information. The account information is maintained within a database with the interface being provided by slurmdbd (Slurm Database daemon). This database can serve as a central storehouse of user and computer information for multiple computers at a single site. Slurm account information is recorded based upon four parameters that form what is referred to as an association. These parameters are user, cluster, partition, and account. user is the login name. cluster is the name of a Slurm managed cluster as specified by the ClusterName parameter in the slurm.conf configuration file. partition is the name of a Slurm partition on that cluster. account is the bank account for a job. The intended mode of operation is to initiate the sacctmgr command, add, delete, modify, and/or list association records, then commit the changes and exit.

       NOTE: The contents of Slurm's database are maintained in lower case. This may result in some sacctmgr output differing from that of other Slurm commands.
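       As a sketch, the add/list workflow described above might look like the following (the cluster, account, and user names here are hypothetical):

       ```shell
       # One-shot (non-interactive) usage; sacctmgr prompts for confirmation
       # before committing each change unless -i/--immediate is given.
       sacctmgr add cluster tux
       sacctmgr add account science Description="science accounts" Organization=sci
       sacctmgr add user alice account=science
       sacctmgr list associations cluster=tux
       ```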

OPTIONS
       -s, --associations
              Use with show or list to display associations with the entity. This is equivalent to the associations command.

       -h, --help
              Print a help message describing the usage of sacctmgr. This is equivalent to the help command.

       -i, --immediate
              Commit changes immediately without asking for confirmation.

       -n, --noheader
              No header will be added to the beginning of the output.

       -p, --parsable
              Output will be '|' delimited with a '|' at the end.

       -P, --parsable2
              Output will be '|' delimited without a '|' at the end.
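       The '|'-delimited modes are convenient for scripting. As an illustrative sketch, the sample text below stands in for real `sacctmgr -P list user` output, and the parse_parsable2 helper is a hypothetical name:

       ```python
       # Parse `sacctmgr -P list ...` output (--parsable2: '|' delimited with
       # no trailing '|'). With -p (--parsable) each row carries a trailing '|'
       # that would first need stripping, e.g. row.rstrip('|').
       sample = """User|Def Acct|Admin
       alice|science|None
       bob|physics|Operator"""

       def parse_parsable2(text):
           """Return one dict per data row, keyed by the header fields."""
           lines = text.strip().splitlines()
           header = lines[0].split("|")
           return [dict(zip(header, row.split("|"))) for row in lines[1:]]

       rows = parse_parsable2(sample)
       ```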

       -Q, --quiet
              Print no messages other than error messages. This is equivalent to the quiet command.

       -r, --readonly
              Makes it so the running sacctmgr cannot modify accounting information. The readonly option is for use within interactive mode.

       -v, --verbose
              Enable detailed logging. This is equivalent to the verbose command.

       -V, --version
              Display version number. This is equivalent to the version command.

COMMANDS
       add <ENTITY> <SPECS>
              Add an entity. Identical to the create command.

       archive {dump|load} <SPECS>
              Write database information to a flat file or load information that has previously been written to a file.

       clear stats
              Clear the server statistics.

       create <ENTITY> <SPECS>
              Add an entity. Identical to the add command.

       delete <ENTITY> where <SPECS>
              Delete the specified entities. Identical to the remove command.

       dump <ENTITY> [File=<FILENAME>]
              Dump cluster data to the specified file. If the filename is not specified it uses <clustername>.cfg by default.

       help   Display a description of sacctmgr options and commands.

       list <ENTITY> [<SPECS>]
              Display information about the specified entity. By default, all entries are displayed; you can narrow results by specifying SPECS in your query. Identical to the show command.

       load <FILENAME>
              Load cluster data from the specified file. This is a configuration file generated by running the sacctmgr dump command. This command does not load archive data; see the sacctmgr archive load option instead.
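       For example, a cluster's data can be round-tripped with dump and load (the cluster name is hypothetical):

       ```shell
       # Dump the configuration of cluster "tux" to a flat file, then reload
       # it; with no File= option the filename would default to tux.cfg.
       sacctmgr dump tux file=tux.cfg
       sacctmgr load tux.cfg
       ```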

       modify <ENTITY> where <SPECS> set <SPECS>
              Modify an entity.
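       A modify command pairs a where clause (which records to match) with a set clause (what to change); for example (entity names hypothetical):

       ```shell
       # Raise the running-job limit on one user's associations on one cluster.
       sacctmgr modify user where name=alice cluster=tux set maxjobs=10
       ```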
       reconfigure
              Reconfigures the SlurmDBD if running with one.

       remove <ENTITY> where <SPECS>
              Delete the specified entities. Identical to the delete command.

       show <ENTITY> [<SPECS>]
              Display information about the specified entity. By default, all entries are displayed; you can narrow results by specifying SPECS in your query. Identical to the list command.

       shutdown
              Shutdown the server.

       version
              Display the version number of sacctmgr.
       NOTE: All commands listed below can be used in the interactive mode, but NOT on the initial command line.

       exit   Terminate sacctmgr interactive mode. Identical to the quit command.

       quiet  Print no messages other than error messages.

       quit   Terminate the execution of sacctmgr interactive mode. Identical to the exit command.

       verbose
              Enable detailed logging. This includes time-stamps on data structures, record counts, etc. This is an independent command with no options meant for use in interactive mode.

       !!     Repeat the last command.

ENTITIES
       account
              A bank account, typically specified at job submit time using the --account= option. These may be arranged in a hierarchical fashion, for example accounts 'chemistry' and 'physics' may be children of the account 'science'. The hierarchy may have an arbitrary depth.
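       The 'science' hierarchy used as an example above could be created as follows (descriptions are hypothetical):

       ```shell
       # Create a parent account, then two child accounts beneath it.
       sacctmgr add account science Description="science parent account"
       sacctmgr add account chemistry,physics parent=science \
                Description="child accounts of science"
       ```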

       association
              The entity used to group information consisting of four parameters: account, cluster, partition (optional), and user. Used only with the list or show command. Add, modify, and delete should be done to a user, account or cluster entity, which will in turn update the underlying associations. Modification of attributes like limits is allowed for an association, but not modification of the four core attributes of an association. You cannot change the partition setting (or set one if it has not been set) for an existing association. Instead, you will need to create a new association with the partition included. You can either keep the previous association with no partition defined, or delete it. Note that these newly added associations are unique entities and any existing usage information will not be carried over to the new association.
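       For instance, to tie an existing user's access to a specific partition, you would add a new association that includes the partition rather than modify the old one (user, account, and partition names hypothetical):

       ```shell
       # Create a new association for the same user/account that includes the
       # partition; the pre-existing partition-less association can then be
       # kept as-is or removed with a delete command.
       sacctmgr add user alice account=chemistry partition=debug
       ```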

       cluster
              The ClusterName parameter in the slurm.conf configuration file, used to differentiate accounts on different machines.

       configuration
              Used only with the list or show command to report current system configuration.

       coordinator
              A special privileged user, usually an account manager, that can add users or sub-accounts to the account they are coordinator over. This should be a trusted person since they can change limits on account and user associations, as well as cancel, requeue or reassign accounts of jobs inside their realm.

       event  Events like downed or draining nodes on clusters.

       federation
              A group of clusters that work together to schedule jobs.

       job    Used to modify specific fields of a job: Derived Exit Code, the Comment String, or wckey.

       problem
              Use with show or list to display entity problems.

       qos    Quality of Service.

       reservation
              A collection of resources set apart for use by a particular account, user or group of users for a given period of time.

       resource
              Software resources for the system. Those are software licenses shared among clusters.

       RunawayJobs
              Used only with the list or show command to report current jobs that have been orphaned on the local cluster and are now runaway. If there are jobs in this state it will also give you an option to "fix" them. NOTE: You must have an AdminLevel of at least Operator to perform this.

       stats  Used with list or show command to view server statistics. Accepts optional argument of ave_time or total_time to sort on those fields. By default, sorts on increasing RPC count field.

       transaction
              List of transactions that have occurred during a given time period.

       tres   Used with list or show command to view a list of Trackable RESources configured on the system.

       user   The login name. Usernames are case-insensitive (forced to lowercase) unless the PreserveCaseUser option has been set in the SlurmDBD configuration file.

       wckeys Workload Characterization Key. An arbitrary string for grouping orthogonal accounts.

GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES
       NOTE: The group limits (GrpJobs, GrpTRES, etc.) are tested when a job is being considered for being allocated resources. If starting a job would cause any of its group limits to be exceeded, that job will not be considered for scheduling even if that job might preempt other jobs which would release sufficient group resources for the pending job to be initiated.

       DefaultQOS=<default_qos>
              The default QOS this association and its children should have. This is overridden if set directly on a user. To clear a previously set value use the modify command with a new value of -1.

       Fairshare={<fairshare_number>|parent}
              Number used in conjunction with other accounts to determine job priority. Can also be the string parent; when used on a user this means that the parent association is used for fairshare. If Fairshare=parent is set on an account, that account's children will be effectively reparented for fairshare calculations to the first parent of their parent that is not Fairshare=parent. Limits remain the same; only its fairshare value is affected. To clear a previously set value use the modify command with a new value of -1.

       GrpTRESMins=TRES=<minutes>[,TRES=<minutes>,...]
              The total number of TRES minutes that can possibly be used by past, present and future jobs running from this association and its children. To clear a previously set value use the modify command with a new value of -1 for each TRES id.

              NOTE: This limit is not enforced if set on the root association of a cluster. So even though it may appear in sacctmgr output, it will not be enforced.

              ALSO NOTE: This limit only applies when using the Priority Multifactor plugin. The time is decayed using the value of PriorityDecayHalfLife or PriorityUsageResetPeriod as set in the slurm.conf. When this limit is reached all associated jobs running will be killed and all future jobs submitted with associations in the group will be delayed until they are able to run inside the limit.

       GrpTRESRunMins=TRES=<minutes>[,TRES=<minutes>,...]
              Used to limit the combined total number of TRES minutes used by all jobs running with this association and its children. This takes into consideration the time limit of running jobs and consumes it. If the limit is reached, no new jobs are started until other jobs finish to allow time to free up.

       GrpTRES=TRES=<max_TRES>[,TRES=<max_TRES>,...]
              Maximum number of TRES running jobs are able to be allocated in aggregate for this association and all associations which are children of this association. To clear a previously set value use the modify command with a new value of -1 for each TRES id.

              NOTE: This limit only applies fully when using the Select Consumable Resource plugin.
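       For example, group TRES limits are set and cleared per TRES id (the account name and values here are hypothetical):

       ```shell
       # Cap aggregate allocations for an account and its children, then
       # clear the CPU cap again by assigning -1 to that TRES id.
       sacctmgr modify account where name=chemistry set grptres=cpu=100,mem=500000
       sacctmgr modify account where name=chemistry set grptres=cpu=-1
       ```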

       GrpJobs=<max_jobs>
              Maximum number of running jobs in aggregate for this association and all associations which are children of this association. To clear a previously set value use the modify command with a new value of -1.

       GrpJobsAccrue=<max_jobs>
              Maximum number of pending jobs in aggregate able to accrue age priority for this association and all associations which are children of this association. To clear a previously set value use the modify command with a new value of -1.

       GrpSubmitJobs=<max_jobs>
              Maximum number of jobs which can be in a pending or running state at any time in aggregate for this association and all associations which are children of this association. To clear a previously set value use the modify command with a new value of -1.

              NOTE: This setting shows up in the sacctmgr output as GrpSubmit.

       GrpWall=<max_wall>
              Maximum wall clock time running jobs are able to be allocated in aggregate for this association and all associations which are children of this association. To clear a previously set value use the modify command with a new value of -1.

              NOTE: This limit is not enforced if set on the root association of a cluster. So even though it may appear in sacctmgr output, it will not be enforced.

              ALSO NOTE: This limit only applies when using the Priority Multifactor plugin. The time is decayed using the value of PriorityDecayHalfLife or PriorityUsageResetPeriod as set in the slurm.conf. When this limit is reached all associated jobs running will be killed and all future jobs submitted with associations in the group will be delayed until they are able to run inside the limit.

       MaxTRESMinsPerJob=TRES=<minutes>[,TRES=<minutes>,...]
              Maximum number of TRES minutes each job is able to use in this association. This is overridden if set directly on a user. Default is the cluster's limit. To clear a previously set value use the modify command with a new value of -1 for each TRES id.

              NOTE: This setting shows up in the sacctmgr output as MaxTRESMins.

       MaxTRESPerJob=TRES=<max_TRES>[,TRES=<max_TRES>,...]
              Maximum number of TRES each job is able to use in this association. This is overridden if set directly on a user. Default is the cluster's limit. To clear a previously set value use the modify command with a new value of -1 for each TRES id.

              NOTE: This setting shows up in the sacctmgr output as MaxTRES.

              NOTE: This limit only applies fully when using cons_res or cons_tres select type plugins.

       MaxJobs=<max_jobs>
              Maximum number of jobs each user is allowed to run at one time in this association. This is overridden if set directly on a user. Default is the cluster's limit. To clear a previously set value use the modify command with a new value of -1.

       MaxJobsAccrue=<max_jobs>
              Maximum number of pending jobs able to accrue age priority at any given time for the given association. This is overridden if set directly on a user. Default is the cluster's limit. To clear a previously set value use the modify command with a new value of -1.

       MaxSubmitJobs=<max_jobs>
              Maximum number of jobs which this association can have in a pending or running state at any time. Default is the cluster's limit. To clear a previously set value use the modify command with a new value of -1.

              NOTE: This setting shows up in the sacctmgr output as MaxSubmit.

       MaxWallDurationPerJob=<max_wall>
              Maximum wall clock time each job is able to use in this association. This is overridden if set directly on a user. Default is the cluster's limit. <max_wall> format is <min> or <min>:<sec> or <hr>:<min>:<sec> or <days>-<hr>:<min>:<sec> or <days>-<hr>. The value is recorded in minutes with rounding as needed. To clear a previously set value use the modify command with a new value of -1.

              NOTE: Changing this value will have no effect on any running or pending job.

              NOTE: This setting shows up in the sacctmgr output as MaxWall.

       Priority
              What priority will be added to a job's priority when using this association. This is overridden if set directly on a user. Default is the cluster's limit. To clear a previously set value use the modify command with a new value of -1.

       QosLevel<operator><comma_separated_list_of_qos_names>
              Specify the Quality of Service (QOS) values that jobs are able to run at for this association. To get a list of valid QOS values use 'sacctmgr list qos'. This value will override its parent's value and push down to its children as the new default. Setting a QosLevel to '' (two single quotes with nothing between them) restores its default setting. You can also use the operators += and -= to add or remove certain QOS values from a QOS list.

              Valid <operator> values include:

              =      Set QosLevel to the specified value. Note: the QOS values that can be used at a given account in the hierarchy are inherited by the children of that account. By assigning QOS values with the = sign, only the assigned QOS values can be used by the account and its children.

              +=     Add the specified <qos> value to the current QosLevel. The account will have access to this QOS and the others previously assigned to it.

              -=     Remove the specified <qos> value from the current QosLevel.

              See the EXAMPLES section below.
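       As a brief sketch of the three operators (the QOS and account names here are hypothetical):

       ```shell
       # = replaces the QOS list; += and -= edit it incrementally.
       sacctmgr modify account where name=chemistry set qoslevel=normal,expedite
       sacctmgr modify account where name=chemistry set qoslevel+=interactive
       sacctmgr modify account where name=chemistry set qoslevel-=expedite
       ```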

SPECIFICATIONS FOR ACCOUNTS
       Cluster=<cluster>
              Specific cluster to add account to. Default is all in system.

       Description=<description>
              An arbitrary string describing an account.

       Name=<name>
              The name of a bank account. Note the name must be unique and cannot represent different bank accounts at different points in the account hierarchy.

       Organization=<org>
              Organization to which the account belongs.

       Parent=<parent>
              Parent account of this account. Default is the root account, a top level account.

       RawUsage=<value>
              This allows an administrator to reset the raw usage accrued to an account. The only value currently supported is 0 (zero). This is a settable specification only - it cannot be used as a filter to list accounts.

       WithAssoc
              Display all associations for this account.

       WithCoord
              Display all coordinators for this account.

       WithDeleted
              Display information with previously deleted data.

       NOTE: If using the WithAssoc option you can also query against association specific information to view only certain associations this account may have. These extra options can be found in the SPECIFICATIONS FOR ASSOCIATIONS section. You can also use the general specifications list above in the GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

LIST/SHOW ACCOUNT FORMAT OPTIONS
       Account
              The name of a bank account.

       Description
              An arbitrary string describing an account.

       Organization
              Organization to which the account belongs.

       Coordinators
              List of users that are a coordinator of the account. (Only filled in when using the WithCoordinator option.)

       NOTE: If using the WithAssoc option you can also view the information about the various associations the account may have on all the clusters in the system. The association information can be filtered. Note that all the accounts in the database will always be shown, as the filter only takes effect over the association data. The Association format fields are described in the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR ASSOCIATIONS
       Clusters=<cluster_name>[,<cluster_name>,...]
              List the associations of the cluster(s).

       Accounts=<account_name>[,<account_name>,...]
              List the associations of the account(s).

       Users=<user_name>[,<user_name>,...]
              List the associations of the user(s).

       Partition=<partition_name>[,<partition_name>,...]
              List the associations of the partition(s).

       NOTE: You can also use the general specifications list above in the GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

       Other options unique for listing associations:

       OnlyDefaults
              Display only associations that are default associations.

       Tree   Display account names in a hierarchical fashion.

       WithDeleted
              Display information with previously deleted data.

       WithSubAccounts
              Display information with subaccounts. Only really valuable when used with the account= option. This will display all the subaccount associations along with the accounts listed in the option.

       WOLimits
              Display information without limit information. This is for a smaller default format of "Cluster,Account,User,Partition".

       WOPInfo
              Display information without parent information (i.e. parent id, and parent account name). This option also implicitly sets the WOPLimits option.

       WOPLimits
              Display information without hierarchical parent limits (i.e. will only display limits where they are set instead of propagating them from the parent).

LIST/SHOW ASSOCIATION FORMAT OPTIONS
       Account
              The name of a bank account in the association.

       Cluster
              The name of a cluster in the association.

       DefaultQOS
              The QOS the association will use by default if it has access to it in the QOS list mentioned below.

       Fairshare
              Number used in conjunction with other accounts to determine job priority. Can also be the string parent; when used on a user this means that the parent association is used for fairshare. If Fairshare=parent is set on an account, that account's children will be effectively reparented for fairshare calculations to the first parent of their parent that is not Fairshare=parent. Limits remain the same; only its fairshare value is affected.

       GrpTRESMins
              The total number of TRES minutes that can possibly be used by past, present and future jobs running from this association and its children.

       GrpTRESRunMins
              Used to limit the combined total number of TRES minutes used by all jobs running with this association and its children. This takes into consideration the time limit of running jobs and consumes it. If the limit is reached, no new jobs are started until other jobs finish to allow time to free up.

       GrpTRES
              Maximum number of TRES running jobs are able to be allocated in aggregate for this association and all associations which are children of this association.

       GrpJobs
              Maximum number of running jobs in aggregate for this association and all associations which are children of this association.

       GrpJobsAccrue
              Maximum number of pending jobs in aggregate able to accrue age priority for this association and all associations which are children of this association.

       GrpSubmitJobs
              Maximum number of jobs which can be in a pending or running state at any time in aggregate for this association and all associations which are children of this association.

              NOTE: This setting shows up in the sacctmgr output as GrpSubmit.

       GrpWall
              Maximum wall clock time running jobs are able to be allocated in aggregate for this association and all associations which are children of this association.

       ID     The id of the association.

       LFT    Associations are kept in a hierarchy: this is the leftmost spot in the hierarchy. When used with the RGT variable, all associations with a LFT inside this LFT and before the RGT are children of this association.

       MaxTRESPerJob
              Maximum number of TRES each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as MaxTRES.

       MaxTRESMinsPerJob
              Maximum number of TRES minutes each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as MaxTRESMins.

       MaxTRESPerNode
              Maximum number of TRES each node in a job allocation can use.

       MaxJobs
              Maximum number of jobs each user is allowed to run at one time.

       MaxJobsAccrue
              Maximum number of pending jobs able to accrue age priority at any given time. This limit only applies to the job's QOS and not the partition's QOS.

       MaxSubmitJobs
              Maximum number of jobs in a pending or running state at any time.

              NOTE: This setting shows up in the sacctmgr output as MaxSubmit.

       MaxWallDurationPerJob
              Maximum wall clock time each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as MaxWall.

       Qos    Valid QOS' for this association.

       QosRaw QOS' ID.

       ParentID
              The association id of the parent of this association.

       ParentName
              The account name of the parent of this association.

       Partition
              The name of a partition in the association.

       Priority
              What priority will be added to a job's priority when using this association.

       WithRawQOSLevel
              Display QosLevel in an unevaluated raw format, consisting of a comma separated list of QOS names prepended with '' (nothing), '+' or '-' for the association. QOS names without +/- prepended were assigned (i.e. sacctmgr modify ... set QosLevel=qos_name) for the entity listed or on one of its parents in the hierarchy. QOS names with +/- prepended indicate the QOS was added/filtered (i.e. sacctmgr modify ... set QosLevel=[+-]qos_name) for the entity listed or on one of its parents in the hierarchy. Including WOPLimits will show exactly where each QOS was assigned, added or filtered in the hierarchy.

       RGT    Associations are kept in a hierarchy: this is the rightmost spot in the hierarchy. When used with the LFT variable, all associations with a LFT inside this RGT and after the LFT are children of this association.

       User   The name of a user in the association.

SPECIFICATIONS FOR CLUSTERS
       Classification=<classification>
              Type of machine; current classifications are capability, capacity and capapacity.

       Features=<comma_separated_list_of_feature_names>
              Features that are specific to the cluster. Federated jobs can be directed to clusters that contain the job requested features.

       Federation=<federation>
              The federation that this cluster should be a member of. A cluster can only be a member of one federation at a time.

       FedState=<state>
              The state of the cluster in the federation. Valid states are:

              ACTIVE Cluster will actively accept and schedule federated jobs.

              INACTIVE
                     Cluster will not schedule or accept any jobs.

              DRAIN  Cluster will not accept any new jobs and will let existing federated jobs complete.

              DRAIN+REMOVE
                     Cluster will not accept any new jobs and will remove itself from the federation once all federated jobs have completed. When removed from the federation, the cluster will accept jobs as a non-federated cluster.

       Name=<name>
              The name of a cluster. This should be equal to the ClusterName parameter in the slurm.conf configuration file for some Slurm-managed cluster.

       RPC=<rpc_list>
              Comma separated list of numeric RPC values.

       WithFed
              Appends federation related columns to default format options (e.g. Federation,ID,Features,FedState).

       WOLimits
              Display information without limit information. This is for a smaller default format of Cluster,ControlHost,ControlPort,RPC.

       NOTE: You can also use the general specifications list above in the GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

LIST/SHOW CLUSTER FORMAT OPTIONS
       Classification
              Type of machine, i.e. capability, capacity or capapacity.

       Cluster
              The name of the cluster.

       ControlHost
              When a slurmctld registers with the database the IP address of the controller is placed here.

       ControlPort
              When a slurmctld registers with the database the port the controller is listening on is placed here.

       Features
              The list of features on the cluster (if any).

       Federation
              The name of the federation this cluster is a member of (if any).

       FedState
              The state of the cluster in the federation (if a member of one).

       FedStateRaw
              Numeric value of the name of the FedState.

       Flags  Attributes possessed by the cluster. Current flags include Cray, External and MultipleSlurmd.

              External clusters are registration only clusters. A slurmctld can designate an external slurmdbd with the AccountingStorageExternalHost slurm.conf option. This allows a slurmctld to register to an external slurmdbd so that clusters attached to the external slurmdbd can communicate with the external cluster with Slurm commands.

       ID     The ID assigned to the cluster when a member of a federation. This ID uniquely identifies the cluster and its jobs in the federation.

       NodeCount
              The current count of nodes associated with the cluster.

       NodeNames
              The current Nodes associated with the cluster.

       PluginIDSelect
              The numeric value of the select plugin the cluster is using.

       RPC    When a slurmctld registers with the database the RPC version the controller is running is placed here.

       TRES   Trackable RESources (Billing, BB (Burst buffer), CPU, Energy, GRES, License, Memory, and Node) this cluster is accounting for.

       NOTE: You can also view the information about the root association for the cluster. The Association format fields are described in the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR COORDINATOR
       Account=<account_name>[,<account_name>,...]
              Account name to add this user as a coordinator to.

       Names=<user_name>[,<user_name>,...]
              Names of coordinators.

       NOTE: To list coordinators use the WithCoordinator options with list account or list user.
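       For example (the user and account names here are hypothetical):

       ```shell
       # Make alice a coordinator of the science account, then verify.
       sacctmgr add coordinator account=science names=alice
       sacctmgr list account withcoordinator where name=science
       ```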

SPECIFICATIONS FOR EVENTS
       All_Clusters
              Shortcut to get information on all clusters.

       All_Time
              Shortcut to get the time period covering all time.

       Clusters=<cluster_name>[,<cluster_name>,...]
              List the events of the cluster(s). Default is the cluster where the command was run.

       End=<OPT>
              Period ending of events. Default is now.

              Valid time formats are...
              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

       Event=<OPT>
              Specific events to look for. Valid options are Cluster or Node; default is both.

       MaxTRES=<OPT>
              Max number of TRES affected by an event.

       MinTRES=<OPT>
              Min number of TRES affected by an event.

       Nodes=<node_name>[,<node_name>,...]
              Node names affected by an event.

       Reason=<reason>[,<reason>,...]
              Reason an event happened.

       Start=<OPT>
              Period start of events. Default is 00:00:00 of the previous day, unless states are given with the States= spec. In that case the default behavior is to return events currently in the states specified.

              Valid time formats are...
              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

       States=<state>[,<state>,...]
              State of a node in a node event. If this is set, the event type is set automatically to Node.

       User=<user_name>[,<user_name>,...]
              Query against users who set the event. If this is set, the event type is set automatically to Node since only user slurm can perform a cluster event.
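       Putting the time formats together, queries for recent node events might look like this (the cluster and node names are hypothetical):

       ```shell
       # Node events on cluster tux over the last two days, then a query
       # filtered by node names and current node states.
       sacctmgr show event cluster=tux event=node start=now-2days
       sacctmgr show event nodes=tux[001-010] states=down,drain
       ```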

LIST/SHOW EVENT FORMAT OPTIONS
       Cluster
              The name of the cluster the event happened on.

       ClusterNodes
              The hostlist of nodes on a cluster in a cluster event.

       Duration
              Time period the event lasted.

       End    Period when the event ended.

       Event  Name of the event.

       EventRaw
              Numeric value of the name of the event.

       NodeName
              The node affected by the event. In a cluster event, this is blank.

       Reason The reason an event happened.

       Start  Period when the event started.

       State  On a node event this is the formatted state of the node during the event.

       StateRaw
              On a node event this is the numeric value of the state of the node during the event.

       TRES   Number of TRES involved with the event.

       User   On a node event this is the user who caused the event to happen.

SPECIFICATIONS FOR FEDERATION
       Clusters[+|-]=<cluster_name>[,<cluster_name>,...]
              List of clusters to add/remove to a federation. A blank value (e.g. clusters=) will remove all clusters from the federation. NOTE: A cluster can only be a member of one federation.
878
879 Name=<name>
880 The name of the federation.
881
882 Tree Display federations in a hierarchical fashion.
883
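       A hedged sketch of the Clusters[+|-]= syntax above (the federation and cluster names are hypothetical):

       ```shell
       # Add a cluster to an existing federation, then remove it again.
       sacctmgr modify federation fed1 set clusters+=clusterA
       sacctmgr modify federation fed1 set clusters-=clusterA
       ```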
LIST/SHOW FEDERATION FORMAT OPTIONS
       Features
              The list of features on the cluster.

       Federation
              The name of the federation.

       Cluster
              Name of the cluster that is a member of the federation.

       FedState
              The state of the cluster in the federation.

       FedStateRaw
              Numeric value of the name of the FedState.

       Index  The index of the cluster in the federation.

SPECIFICATIONS FOR JOB
       Comment=<comment>
              The job's comment string, stored when the AccountingStoreFlags parameter in the slurm.conf file contains 'job_comment'.  A user can only modify the comment string of their own job.

       Cluster=<cluster_list>
              List of clusters to alter jobs on.  Defaults to the local cluster.

       DerivedExitCode=<derived_exit_code>
              The derived exit code can be modified after a job completes, based on the user's judgment of whether the job succeeded or failed.  A user can only modify the derived exit code of their own job.

       EndTime
              Jobs must end before this time to be modified.  The format is YYYY-MM-DDTHH:MM:SS, unless changed through the SLURM_TIME_FORMAT environment variable.

       JobID=<jobid_list>
              The id of the job to change.  Not needed if altering multiple jobs using the wckey specification.

       NewWCKey=<new_wckey>
              Use to rename a wckey on job(s) in the accounting database.

       StartTime
              Jobs must start at or after this time to be modified.  Uses the same format as EndTime.

       User=<user_list>
              Used to specify the users whose jobs are to be altered.

       WCKey=<wckey_list>
              Used to specify the wckeys to alter.

       The DerivedExitCode, Comment and WCKey fields are the only fields of a job record in the database that can be modified after job completion.

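       A minimal sketch of modifying completed job records as described above (the job ID, comment, user, and wckey values are all hypothetical):

       ```shell
       # Set a derived exit code and a comment on a finished job.
       sacctmgr modify job jobid=1234 set derivedexitcode=1 comment="failed validation"

       # Rename a wckey across all of one user's job records.
       sacctmgr modify job set newwckey=projB where wckey=projA user=alice
       ```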
       The sacct command is the exclusive command to display job records from the Slurm database.

       NOTE: The group limits (GrpJobs, GrpNodes, etc.) are tested when a job is being considered for being allocated resources.  If starting a job would cause any of its group limits to be exceeded, that job will not be considered for scheduling even if that job might preempt other jobs which would release sufficient group resources for the pending job to be initiated.

       Flags  Used by the slurmctld to override or enforce certain characteristics.  Valid options are:

              DenyOnLimit
                     If set, jobs using this QOS will be rejected at submission time if they do not conform to the QOS 'Max' or 'Min' limits as stand-alone jobs.  Jobs that exceed these limits when other jobs are considered, but conform to the limits when considered individually, will not be rejected.  Instead they will pend until resources are available.  Group limits (e.g. GrpTRES) will also be treated like 'Max' limits (e.g. MaxTRESPerNode) and jobs will be denied if they would violate the limit as stand-alone jobs.  This currently only applies to QOS and Association limits.

              EnforceUsageThreshold
                     If set, and the QOS also has a UsageThreshold, any jobs submitted with this QOS that fall below the UsageThreshold will be held until their Fairshare Usage goes above the Threshold.

              NoDecay
                     If set, this QOS will not have its GrpTRESMins, GrpWall and UsageRaw decayed by the slurm.conf PriorityDecayHalfLife or PriorityUsageResetPeriod settings.  This allows a QOS to provide aggregate limits that, once consumed, will not be replenished automatically.  Such a QOS will act as a time-limited quota of resources for an association that has access to it.  Account/user usage will still be decayed for associations using the QOS.  The QOS GrpTRESMins and GrpWall limits can be increased, or the QOS RawUsage value reset to 0 (zero), to again allow jobs submitted with this QOS to be queued (if DenyOnLimit is set) or run (pending with QOSGrp{TRES}MinutesLimit or QOSGrpWallLimit reasons, where {TRES} is some type of trackable resource).

              NoReserve
                     If this flag is set and backfill scheduling is used, jobs using this QOS will not reserve resources in the backfill schedule's map of resources allocated through time.  This flag is intended for use with a QOS that may be preempted by jobs associated with all other QOS (e.g. use with a "standby" QOS).  If this flag is used with a QOS which cannot be preempted by all other QOS, it could result in starvation of larger jobs.

              PartitionMaxNodes
                     If set, jobs using this QOS will be able to override the requested partition's MaxNodes limit.

              PartitionMinNodes
                     If set, jobs using this QOS will be able to override the requested partition's MinNodes limit.

              OverPartQOS
                     If set, jobs using this QOS will be able to override any limits imposed by the requested partition's QOS.

              PartitionTimeLimit
                     If set, jobs using this QOS will be able to override the requested partition's TimeLimit.

              RequiresReservation
                     If set, jobs using this QOS must designate a reservation when submitting a job.  This option can be useful in restricting usage of a QOS that may have greater preemptive capability or additional resources, so that it is allowed only within a reservation.

              UsageFactorSafe
                     If set, and AccountingStorageEnforce includes Safe, jobs will only be able to run if the job can run to completion with the UsageFactor applied.

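       A hedged sketch of setting the flags described above on a QOS (the QOS name and flag combination are illustrative; -i commits without prompting):

       ```shell
       # Reject over-limit jobs at submission and require a reservation.
       sacctmgr -i modify qos normal set flags=DenyOnLimit,RequiresReservation
       ```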
       GraceTime
              Preemption grace time, in seconds, to be extended to a job which has been selected for preemption.

       GrpTRESMins
              The total number of TRES minutes that can possibly be used by past, present and future jobs running from this QOS.

       GrpTRESRunMins
              Used to limit the combined total number of TRES minutes used by all jobs running with this QOS.  This takes into consideration the time limit of running jobs and consumes it; if the limit is reached, no new jobs are started until other jobs finish and free up time.

       GrpTRES
              Maximum number of TRES running jobs are able to be allocated in aggregate for this QOS.

       GrpJobs
              Maximum number of running jobs in aggregate for this QOS.

       GrpJobsAccrue
              Maximum number of pending jobs in aggregate able to accrue age priority for this QOS.  This limit only applies to the job's QOS and not the partition's QOS.

       GrpSubmitJobs
              Maximum number of jobs which can be in a pending or running state at any time in aggregate for this QOS.

              NOTE: This setting shows up in the sacctmgr output as GrpSubmit.

       GrpWall
              Maximum wall clock time running jobs are able to be allocated in aggregate for this QOS.  If this limit is reached, submission requests will be denied and the running jobs will be killed.

       ID     The id of the QOS.

       MaxJobsAccruePerAccount
              Maximum number of pending jobs an account (or subacct) can have accruing age priority at any given time.  This limit only applies to the job's QOS and not the partition's QOS.

       MaxJobsAccruePerUser
              Maximum number of pending jobs a user can have accruing age priority at any given time.  This limit only applies to the job's QOS and not the partition's QOS.

       MaxJobsPerAccount
              Maximum number of jobs each account is allowed to run at one time.

       MaxJobsPerUser
              Maximum number of jobs each user is allowed to run at one time.

       MaxSubmitJobsPerAccount
              Maximum number of jobs in a pending or running state at any time per account.

       MaxSubmitJobsPerUser
              Maximum number of jobs in a pending or running state at any time per user.

       MaxTRESMinsPerJob
              Maximum number of TRES minutes each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as MaxTRESMins.

       MaxTRESPerAccount
              Maximum number of TRES each account is able to use.

       MaxTRESPerJob
              Maximum number of TRES each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as MaxTRES.

       MaxTRESPerNode
              Maximum number of TRES each node in a job allocation can use.

       MaxTRESPerUser
              Maximum number of TRES each user is able to use.

       MaxWallDurationPerJob
              Maximum wall clock time each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as MaxWall.

       MinPrioThreshold
              Minimum priority required to reserve resources when scheduling.

       MinTRESPerJob
              Minimum number of TRES each job running under this QOS must request.  Otherwise the job will pend until modified.

              NOTE: This setting shows up in the sacctmgr output as MinTRES.

       Name   Name of the QOS.

       Preempt
              Other QOS' this QOS can preempt.

              NOTE: The Priority of a QOS is NOT related to QOS preemption; only Preempt is used to define which QOS can preempt others.

       PreemptExemptTime
              Specifies a minimum run time for jobs of this QOS before they are considered for preemption.  This QOS option takes precedence over the global PreemptExemptTime.  This is only honored for PreemptMode=REQUEUE and PreemptMode=CANCEL.
              Setting it to -1 disables the option, allowing another QOS or the global option to take effect.  Setting it to 0 indicates no minimum run time and supersedes the lower priority QOS (see OverPartQOS) and/or the global option in slurm.conf.

       PreemptMode
              Mechanism used to preempt jobs or enable gang scheduling for this QOS when the cluster PreemptType is set to preempt/qos.  This QOS-specific PreemptMode will override the cluster-wide PreemptMode for this QOS.  Unsetting the QOS-specific PreemptMode, by specifying "OFF", "" or "Cluster", makes it use the default cluster-wide PreemptMode.
              The GANG option is used to enable gang scheduling independent of whether preemption is enabled (i.e. independent of the PreemptType setting).  It can be specified in addition to a PreemptMode setting with the two options comma separated (e.g. PreemptMode=SUSPEND,GANG).
              See <https://slurm.schedmd.com/preempt.html> and <https://slurm.schedmd.com/gang_scheduling.html> for more details.

              NOTE: For performance reasons, the backfill scheduler reserves whole nodes for jobs, not partial nodes.  If during backfill scheduling a job preempts one or more other jobs, the whole nodes for those preempted jobs are reserved for the preemptor job, even if the preemptor job requested fewer resources than that.  These reserved nodes aren't available to other jobs during that backfill cycle, even if the other jobs could fit on the nodes.  Therefore, jobs may preempt more resources during a single backfill iteration than they requested.
              NOTE: For a heterogeneous job to be considered for preemption, all components must be eligible for preemption.  When a heterogeneous job is to be preempted, the first identified component of the job with the highest order PreemptMode (SUSPEND (highest), REQUEUE, CANCEL (lowest)) will be used to set the PreemptMode for all components.  The GraceTime and user warning signal for each component of the heterogeneous job remain unique.  Heterogeneous jobs are excluded from GANG scheduling operations.

              OFF    Is the default value and disables job preemption and gang scheduling.  It is only compatible with PreemptType=preempt/none at a global level.

              CANCEL The preempted job will be cancelled.

              GANG   Enables gang scheduling (time slicing) of jobs in the same partition, and allows the resuming of suspended jobs.  Gang scheduling is performed independently for each partition, so if you only want time-slicing by OverSubscribe, without any preemption, then configuring partitions with overlapping nodes is not recommended.  Time-slicing won't happen between jobs on different partitions.

                     NOTE: Heterogeneous jobs are excluded from GANG scheduling operations.

              REQUEUE
                     Preempts jobs by requeuing them (if possible) or canceling them.  For jobs to be requeued, they must have the --requeue sbatch option set or the cluster-wide JobRequeue parameter in slurm.conf must be set to 1.

              SUSPEND
                     The preempted jobs will be suspended, and later the Gang scheduler will resume them.  Therefore the SUSPEND preemption mode always needs the GANG option to be specified at the cluster level.  Also, because the suspended jobs will still use memory on the allocated nodes, Slurm needs to be able to track memory resources to be able to suspend jobs.
                     If PreemptType=preempt/qos is configured and the preempted job(s) and the preemptor job are on the same partition, then they will share resources with the Gang scheduler (time-slicing).  If not (i.e. if the preemptees and preemptor are on different partitions), then the preempted jobs will remain suspended until the preemptor ends.

                     NOTE: Suspended jobs will not release GRES.  Higher priority jobs will not be able to preempt to gain access to GRES.

       Priority
              What priority will be added to a job's priority when using this QOS.

              NOTE: The Priority of a QOS is NOT related to QOS preemption; see Preempt instead.

       RawUsage=<value>
              This allows an administrator to reset the raw usage accrued to a QOS.  The only value currently supported is 0 (zero).  This is a settable specification only - it cannot be used as a filter to list accounts.

       UsageFactor
              Usage factor when running with this QOS.  See below for more details.

       LimitFactor
              Factor to scale TRES count limits when running with this QOS.  See below for more details.

       UsageThreshold
              A float representing the lowest fairshare of an association allowable to run a job.  If an association falls below this threshold and has pending jobs or submits new jobs, those jobs will be held until the usage goes back above the threshold.  Use sshare to see current shares on the system.

       WithDeleted
              Display information with previously deleted data.

       Description
              An arbitrary string describing a QOS.

       GraceTime
              Preemption grace time to be extended to a job which has been selected for preemption, in the format of hh:mm:ss.  The default value is zero; no preemption grace time is allowed on this QOS.  This value is only meaningful for QOS PreemptMode=CANCEL and PreemptMode=REQUEUE.

       GrpTRESMins
              The total number of TRES minutes that can possibly be used by past, present and future jobs running from this QOS.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.  NOTE: This limit only applies when using the Priority Multifactor plugin.  The time is decayed using the value of PriorityDecayHalfLife or PriorityUsageResetPeriod as set in the slurm.conf.  When this limit is reached, all associated running jobs will be killed and all future jobs submitted with this QOS will be delayed until they are able to run inside the limit.

       GrpTRES
              Maximum number of TRES running jobs are able to be allocated in aggregate for this QOS.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

       GrpJobs
              Maximum number of running jobs in aggregate for this QOS.  To clear a previously set value use the modify command with a new value of -1.

       GrpJobsAccrue
              Maximum number of pending jobs in aggregate able to accrue age priority for this QOS.  This limit only applies to the job's QOS and not the partition's QOS.  To clear a previously set value use the modify command with a new value of -1.

       GrpSubmitJobs
              Maximum number of jobs which can be in a pending or running state at any time in aggregate for this QOS.  To clear a previously set value use the modify command with a new value of -1.

              NOTE: This setting shows up in the sacctmgr output as GrpSubmit.

       GrpWall
              Maximum wall clock time running jobs are able to be allocated in aggregate for this QOS.  To clear a previously set value use the modify command with a new value of -1.  NOTE: This limit only applies when using the Priority Multifactor plugin.  The time is decayed using the value of PriorityDecayHalfLife or PriorityUsageResetPeriod as set in the slurm.conf.  When this limit is reached, all associated running jobs will be killed and all future jobs submitted with this QOS will be delayed until they are able to run inside the limit.

       MaxJobsAccruePerAccount
              Maximum number of jobs an account (or subacct) can have accruing age priority at any given time.  This limit only applies to the job's QOS and not the partition's QOS.

       MaxJobsAccruePerUser
              Maximum number of jobs a user can have accruing age priority at any given time.  This limit only applies to the job's QOS and not the partition's QOS.

       MaxTRESMinsPerJob
              Maximum number of TRES minutes each job is able to use.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

              NOTE: This setting shows up in the sacctmgr output as MaxTRESMins.

       MaxTRESPerAccount
              Maximum number of TRES each account is able to use.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

       MaxTRESPerJob
              Maximum number of TRES each job is able to use.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

              NOTE: This setting shows up in the sacctmgr output as MaxTRES.

       MaxTRESPerNode
              Maximum number of TRES each node in a job allocation can use.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

       MaxTRESPerUser
              Maximum number of TRES each user is able to use.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

       MaxJobsPerAccount
              Maximum number of jobs each account is allowed to run at one time.  To clear a previously set value use the modify command with a new value of -1.

       MaxJobsPerUser
              Maximum number of jobs each user is allowed to run at one time.  To clear a previously set value use the modify command with a new value of -1.

       MaxSubmitJobsPerAccount
              Maximum number of jobs in a pending or running state at any time per account.  To clear a previously set value use the modify command with a new value of -1.

       MaxSubmitJobsPerUser
              Maximum number of jobs in a pending or running state at any time per user.  To clear a previously set value use the modify command with a new value of -1.

       MaxWallDurationPerJob
              Maximum wall clock time each job is able to use.  <max wall> format is <min> or <min>:<sec> or <hr>:<min>:<sec> or <days>-<hr>:<min>:<sec> or <days>-<hr>.  The value is recorded in minutes, with rounding as needed.  To clear a previously set value use the modify command with a new value of -1.

              NOTE: This setting shows up in the sacctmgr output as MaxWall.

       MinPrioThreshold
              Minimum priority required to reserve resources when scheduling.  To clear a previously set value use the modify command with a new value of -1.

       MinTRES
              Minimum number of TRES each job running under this QOS must request.  Otherwise the job will pend until modified.  To clear a previously set value use the modify command with a new value of -1 for each TRES id.

       Name   Name of the QOS.  Needed for creation.

       Preempt
              Other QOS' this QOS can preempt.  Setting Preempt to '' (two single quotes with nothing between them) restores its default setting.  You can also use the operators += and -= to add or remove certain QOS's from a QOS list.

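       An illustrative sketch of the += and -= operators described above (the QOS names are hypothetical):

       ```shell
       # Let QOS "high" preempt "standby", then revoke that again.
       sacctmgr modify qos high set preempt+=standby
       sacctmgr modify qos high set preempt-=standby
       ```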
       PreemptMode
              Mechanism used to preempt jobs of this QOS if the cluster's PreemptType is configured to preempt/qos.  The default preemption mechanism is specified by the cluster-wide PreemptMode configuration parameter.  Possible values are "Cluster" (meaning use the cluster default), "Cancel", and "Requeue".  This option is not compatible with PreemptMode=OFF or PreemptMode=SUSPEND (i.e. preempted jobs must be removed from the resources).

       Priority
              What priority will be added to a job's priority when using this QOS.  To clear a previously set value use the modify command with a new value of -1.

       UsageFactor
              A float that is factored into a job's TRES usage (e.g. RawUsage, TRESMins, TRESRunMins).  For example, if the usage factor is 2, every TRESBillingUnit second a job runs counts as 2.  If the usage factor is .5, every second counts as only half of the time.  A setting of 0 adds no timed usage from the job.

              The usage factor only applies to the job's QOS and not the partition QOS.

              If the UsageFactorSafe flag is set and AccountingStorageEnforce includes Safe, jobs will only be able to run if the job can run to completion with the UsageFactor applied.

              If the UsageFactorSafe flag is not set and AccountingStorageEnforce includes Safe, a job will be able to be scheduled without the UsageFactor applied and will be able to run without being killed due to limits.

              If the UsageFactorSafe flag is not set and AccountingStorageEnforce does not include Safe, a job will be able to be scheduled without the UsageFactor applied and could be killed due to limits.

              See AccountingStorageEnforce in the slurm.conf man page.

              Default is 1.  To clear a previously set value use the modify command with a new value of -1.

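       The arithmetic above can be sketched as follows (the values are illustrative, not from a real cluster):

       ```shell
       # With UsageFactor=2, 300 TRES-billing seconds of run time are
       # charged as 600 seconds of usage.
       awk 'BEGIN { usage_factor = 2; tres_secs = 300
                    printf "%d\n", usage_factor * tres_secs }'
       ```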
       LimitFactor
              A float that is factored into an association's [Grp|Max]TRES limits.  For example, if the LimitFactor is 2, then an association with a GrpTRES of 30 CPUs would be allowed to allocate 60 CPUs when running under this QOS.

              NOTE: This factor is only applied to associations running in this QOS and is not applied to any limits in the QOS itself.

              To clear a previously set value use the modify command with a new value of -1.

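       The scaling above can be sketched numerically (the values are illustrative):

       ```shell
       # With LimitFactor=2, a GrpTRES limit of 30 CPUs is enforced as 60.
       awk 'BEGIN { limit_factor = 2; grp_tres_cpus = 30
                    printf "%d\n", limit_factor * grp_tres_cpus }'
       ```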
SPECIFICATIONS FOR RESERVATION
       Clusters=<cluster_name>[,<cluster_name>,...]
              List the reservations of the cluster(s).  Default is the cluster where the command was run.

       End=<OPT>
              Period ending of reservations.  Default is now.

              Valid time formats are...
              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

       ID=<OPT>
              Comma separated list of reservation ids.

       Names=<OPT>
              Comma separated list of reservation names.

       Nodes=<node_name>[,<node_name>,...]
              Node names where the reservation ran.

       Start=<OPT>
              Period start of reservations.  Default is 00:00:00 of the current day.

              Valid time formats are...
              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

LIST/SHOW RESERVATION FORMAT OPTIONS
       Associations
              The ids of the associations able to run in the reservation.

       Cluster
              Name of the cluster the reservation was on.

       End    End time of the reservation.

       Flags  Flags on the reservation.

       ID     Reservation ID.

       Name   Name of this reservation.

       NodeNames
              List of nodes in the reservation.

       Start  Start time of the reservation.

       TRES   List of TRES in the reservation.

       UnusedWall
              Wall clock time in seconds unused by any job.  A job's allocated usage is its run time multiplied by the ratio of its CPUs to the total number of CPUs in the reservation.  For example, a job using all the CPUs in the reservation and running for 1 minute would reduce unused_wall by 1 minute.

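       The UnusedWall accounting above can be sketched numerically (all values are hypothetical):

       ```shell
       # A job using 8 of a reservation's 16 CPUs for 600 seconds charges
       # 600 * 8/16 = 300 seconds against the reservation's wall time.
       awk 'BEGIN { job_cpus = 8; resv_cpus = 16; run_secs = 600
                    printf "%d\n", run_secs * job_cpus / resv_cpus }'
       ```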
SPECIFICATIONS FOR RESOURCE
       Clusters=<name_list>
              Comma separated list of cluster names on which the specified resources are to be available.  If no names are designated, then the clusters already allowed to use this resource will be altered.

       Count=<OPT>
              Number of software resources of a specific name configured on the system being controlled by a resource manager.

       Descriptions=
              A brief description of the resource.

       Flags=<OPT>
              Flags that identify specific attributes of the system resource.  At this time no flags have been defined.

       ServerType=<OPT>
              The type of software resource manager providing the licenses, for example FlexNet Publisher (FlexLM license server) or Reprise License Manager (RLM).

       Names=<OPT>
              Comma separated list of the names of resources configured on the system being controlled by a resource manager.  If this resource is seen on the slurmctld, its name will be name@server to distinguish it from local resources defined in a slurm.conf.

       PercentAllowed=<percent_allowed>
              Percentage of a specific resource that can be used on the specified cluster.

       Server=<OPT>
              The name of the server serving up the resource.  Default is 'slurmdb', indicating the licenses are being served by the database.

       Type=<OPT>
              The type of the resource represented by this record.  Currently the only valid type is License.

       WithClusters
              Display each cluster's percentage of the resource.  If a resource hasn't been given to a cluster, the resource will not be displayed with this flag.

       NOTE: Resource is used to define each resource configured on a system available for usage by Slurm clusters.

LIST/SHOW RESOURCE FORMAT OPTIONS
       Cluster
              Name of the cluster the resource is given to.

       Count  The count of a specific resource configured on the system globally.

       Allocated
              The percent of licenses allocated to a cluster.

       Description
              Description of the resource.

       ServerType
              The type of the server controlling the licenses.

       Name   Name of this resource.

       Server Server serving up the resource.

       Type   Type of resource this record represents.

LIST/SHOW RUNAWAY JOB FORMAT OPTIONS
       Cluster
              Name of the cluster the job ran on.

       ID     Id of the job.

       Name   Name of the job.

       Partition
              Partition the job ran on.

       State  Current state of the job in the database.

       TimeStart
              Time the job started running.

       TimeEnd
              Current recorded time of the end of the job.

SPECIFICATIONS FOR TRANSACTIONS
       Accounts=<account_name>[,<account_name>,...]
              Only print out the transactions affecting the specified accounts.

       Action=<Specific_action_the_list_will_display>
              Only display transactions of the specified action type.

       Actor=<Specific_name_the_list_will_display>
              Only display transactions done by a certain person.

       Clusters=<cluster_name>[,<cluster_name>,...]
              Only print out the transactions affecting the specified clusters.

       End=<Date_and_time_of_last_transaction_to_return>
              Return all transactions before this date and time.  Default is now.

       Start=<Date_and_time_of_first_transaction_to_return>
              Return all transactions after this date and time.  Default is epoch.

              Valid time formats for End and Start are...
              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

       Users=<user_name>[,<user_name>,...]
              Only print out the transactions affecting the specified users.

       WithAssoc
              Get information about which associations were affected by the transactions.

LIST/SHOW TRANSACTIONS FORMAT OPTIONS
       Action Displays the type of Action that took place.

       Actor  Displays the Actor who generated the transaction.

       Info   Displays details of the transaction.

       TimeStamp
              Displays when the transaction occurred.

       Where  Displays details of the constraints for the transaction.

       NOTE: If using the WithAssoc option you can also view the information about the various associations the transaction affected.  The Association format fields are described in the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR USERS
       Account=<account>
              Account name to add this user to.

       AdminLevel=<level>
              Admin level of user.  Valid levels are None, Operator, and Admin.

       Cluster=<cluster>
              Specific cluster to add the user to the account on.  Default is all clusters in the system.

       DefaultAccount=<account>
              Identify the default bank account name to be used for a job if none is specified at submission time.

       DefaultWCKey=<defaultwckey>
              Identify the default Workload Characterization Key.

       Name=<name>
              Name of the user.

       NewName=<newname>
              Use to rename a user in the accounting database.

       Partition=<name>
              Partition name.

       RawUsage=<value>
              This allows an administrator to reset the raw usage accrued to a user.  The only value currently supported is 0 (zero).  This is a settable specification only - it cannot be used as a filter to list users.

       WCKeys=<wckeys>
              Workload Characterization Key values.

       WithAssoc
              Display all associations for this user.

       WithCoord
              Display all accounts the user is a coordinator for.

       WithDeleted
              Display information with previously deleted data.

       NOTE: If using the WithAssoc option you can also query against association-specific information to view only certain associations this user may have.  These extra options can be found in the SPECIFICATIONS FOR ASSOCIATIONS section.  You can also use the general specifications listed above in the GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

LIST/SHOW USER FORMAT OPTIONS
       AdminLevel
              Admin level of the user.

       Coordinators
              List of users that are coordinators of the account.  (Only filled in when using the WithCoordinator option.)

       DefaultAccount
              The user's default account.

       DefaultWCKey
              The user's default wckey.

       User   The name of the user.

       NOTE: If using the WithAssoc option you can also view the information about the various associations the user may have on all the clusters in the system.  The association information can be filtered.  Note that all the users in the database will always be shown, as the filter only takes effect on the association data.  The Association format fields are described in the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.

LIST/SHOW WCKEY FORMAT OPTIONS
       Cluster
              Specific cluster for the WCKey.

       ID     The ID of the WCKey.

       User   The name of the user for the WCKey.

       WCKey  Workload Characterization Key.

LIST/SHOW TRES
       Name   The name of the trackable resource.  This option is required
              for TRES types BB (Burst buffer), GRES, and License.  Types
              CPU, Energy, Memory, and Node do not have Names.  For
              example, if GRES is the type then the name is the
              denomination of the GRES itself, e.g. GPU.

       ID     The identification number of the trackable resource as it
              appears in the database.

       Type   The type of the trackable resource.  Current types are BB
              (Burst buffer), CPU, Energy, GRES, License, Memory, and Node.

TRES information
       Trackable RESources (TRES) are used in many QOS or Association
       limits.  When setting a limit, the TRES are given as a
       comma-separated list.  Each TRES gets its own limit, i.e.
       GrpTRESMins=cpu=10,mem=20 would create two different limits: one
       for 10 CPU minutes and one for 20 MB of memory minutes.  This is
       the case for each limit that deals with TRES.  To remove a limit,
       -1 is used, i.e. GrpTRESMins=cpu=-1 would remove only the CPU TRES
       limit.

       NOTE: When dealing with Memory as a TRES, all limits are in MB.

       NOTE: The Billing TRES is calculated from a partition's
       TRESBillingWeights.  It is temporarily calculated during scheduling
       for each partition to enforce billing TRES limits.  The final
       Billing TRES is calculated after the job has been allocated
       resources.  The final number can be seen in scontrol show jobs and
       sacct output.
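       As a sketch of the syntax above (the account name "physics" is
       hypothetical), setting two TRES limits in one list and then
       removing only one of them might look like:

       ```shell
       # Set two separate limits in one comma-separated list:
       # 10 CPU-minutes and 20 MB-minutes of memory.
       sacctmgr modify account name=physics set GrpTRESMins=cpu=10,mem=20

       # Remove only the CPU limit; the memory limit stays in place.
       sacctmgr modify account name=physics set GrpTRESMins=cpu=-1
       ```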

GLOBAL FORMAT OPTION
       When using the format option for listing various fields you can put
       a %NUMBER afterwards to specify how many characters should be
       printed.

       e.g. format=name%30 will print 30 characters of the name field,
       right justified.  A -30 will print 30 characters left justified.
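       For example, a listing that pads the user name to 30 characters
       left justified and the default account to 10 right justified (a
       usage sketch; any format fields from the sections above can be
       combined this way):

       ```shell
       sacctmgr list user format=name%-30,defaultaccount%10
       ```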

FLAT FILE DUMP AND LOAD
       sacctmgr has the capability to load and dump Slurm association data
       to and from a file.  This method can easily add a new cluster or
       copy an existing cluster's associations into a new cluster with
       similar accounts.  Each file contains Slurm association data for a
       single cluster.  Comments can be put into the file with the #
       character.  Each line of information must begin with one of the
       four titles: Cluster, Parent, Account or User.  Following the title
       is a space, dash, space, entity value, then specifications.
       Specifications are colon separated.  If any variable, such as an
       Organization name, has a space in it, surround the name with single
       or double quotes.

       To create a file of associations you can run
              sacctmgr dump tux file=tux.cfg

       To load a previously created file you can run
              sacctmgr load file=tux.cfg

       sacctmgr dump/load must be run as a Slurm administrator or root.
       If using sacctmgr load on a database without any associations, it
       must be run as root (because there aren't any users in the database
       yet).

       Other options for load are:
              clean - delete what was already there and start from scratch
              with this information.
              Cluster= - specify a different name for the cluster than
              that which is in the file.

       Since the associations in the system follow a hierarchy, so does
       the file.  Anything that is a parent needs to be defined before any
       children.  The only exception is the implicit 'root' account.  This
       is always a default for any cluster and does not need to be
       defined.

       To edit/create a file start with a cluster line for the new
       cluster:

              Cluster - cluster_name:MaxTRESPerJob=node=15

       Anything included on this line will be the default for all
       associations on this cluster.  The options for the cluster are:

       GrpTRESMins=
              The total number of TRES minutes that can possibly be used
              by past, present and future jobs running from this
              association and its children.

       GrpTRESRunMins=
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limit of running jobs
              and consumes it.  If the limit is reached, no new jobs are
              started until other jobs finish to allow time to free up.

       GrpTRES=
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpJobs=
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.

       GrpJobsAccrue=
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.

       GrpNodes=
              Maximum number of nodes running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       GrpSubmitJobs=
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.

       GrpWall=
              Maximum wall clock time running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs the children of this association can
              run.

       MaxTRESPerJob=
              Maximum number of trackable resources per job the children
              of this association can run.

       MaxWallDurationPerJob=
              Maximum time (not related to job size) that jobs of this
              association's children can run.

       QOS=   Comma separated list of Quality of Service names (defined in
              sacctmgr).

       After the entry for the root account you will have entries for the
       other accounts on the system.  The entries will look similar to
       this example:

       Parent - root
       Account - cs:MaxTRESPerJob=node=5:MaxJobs=4:FairShare=399:MaxWallDurationPerJob=40:Description='Computer Science':Organization='LC'
       Parent - cs
       Account - test:MaxTRESPerJob=node=1:MaxJobs=1:FairShare=1:MaxWallDurationPerJob=1:Description='Test Account':Organization='Test'

       Any of the options after a ':' can be left out and they can be in
       any order.  If you want to add any sub accounts just list the
       Parent THAT HAS ALREADY BEEN CREATED before the account you are
       adding.

       Account options are:

       Description=
              A brief description of the account.

       GrpTRESMins=
              The total number of TRES minutes that can possibly be used
              by past, present and future jobs running from this
              association and its children.

       GrpTRESRunMins=
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limit of running jobs
              and consumes it.  If the limit is reached, no new jobs are
              started until other jobs finish to allow time to free up.

       GrpTRES=
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpJobs=
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.

       GrpJobsAccrue=
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.

       GrpNodes=
              Maximum number of nodes running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       GrpSubmitJobs=
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.

       GrpWall=
              Maximum wall clock time running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs the children of this association can
              run.

       MaxNodesPerJob=
              Maximum number of nodes per job the children of this
              association can run.

       MaxWallDurationPerJob=
              Maximum time (not related to job size) that jobs of this
              account's children can run.

       Organization=
              Name of organization that owns this account.

       QOS(=,+=,-=)
              Comma separated list of Quality of Service names (defined in
              sacctmgr).


       To add users to an account add a line after the Parent line,
       similar to this:

       Parent - test
       User - adam:MaxTRESPerJob=node=2:MaxJobs=3:FairShare=1:MaxWallDurationPerJob=1:AdminLevel=Operator:Coordinator='test'

       User options are:

       AdminLevel=
              Type of admin this user is (Administrator, Operator).
              Must be defined on the first occurrence of the user.

       Coordinator=
              Comma separated list of accounts this user is coordinator
              over.
              Must be defined on the first occurrence of the user.

       DefaultAccount=
              System wide default account name.
              Must be defined on the first occurrence of the user.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs this user can run.

       MaxTRESPerJob=
              Maximum number of trackable resources per job this user can
              run.

       MaxWallDurationPerJob=
              Maximum time (not related to job size) this user can run.

       QOS(=,+=,-=)
              Comma separated list of Quality of Service names (defined in
              sacctmgr).

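       Putting the pieces above together, a complete minimal file for a
       hypothetical cluster tux could look like the sketch below (all
       account, user, and limit values are illustrative).  Loading it with
       "sacctmgr load file=tux.cfg" creates the whole hierarchy in one
       pass:

       ```
       # tux.cfg - hypothetical association dump; '#' starts a comment.
       # Parents must appear before their children; 'root' is implicit.
       Cluster - tux:FairShare=1:QOS=normal
       Parent - root
       Account - cs:FairShare=399:Description='Computer Science':Organization='LC'
       Parent - cs
       Account - test:FairShare=1:Description='Test Account':Organization='Test'
       Parent - test
       User - adam:DefaultAccount=test:FairShare=1
       ```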

ARCHIVE FUNCTIONALITY
       sacctmgr has the capability to archive to a flat file and/or load
       that data if needed later.  The archiving is usually done by the
       slurmdbd and it is highly recommended you only do it through
       sacctmgr if you completely understand what you are doing.  For
       slurmdbd options see "man slurmdbd" for more information.  Loading
       data into the database can be done from these files to either view
       old data or regenerate rolled up data.

       archive dump
       Dump accounting data to file.  Data will not be archived unless the
       corresponding purge option is included in this command or in
       slurmdbd.conf.  This operation cannot be rolled back once executed.
       If one of the following options is not specified when sacctmgr is
       called, the value configured in slurmdbd.conf is used.

       Directory=
              Directory to store the archive data.

       Events Archive Events.  If not specified and PurgeEventAfter is
              set, all event data removed will be lost permanently.

       Jobs   Archive Jobs.  If not specified and PurgeJobAfter is set,
              all job data removed will be lost permanently.

       PurgeEventAfter=
              Purge cluster event records older than the time stated, in
              months.  If you want to purge on a shorter time period you
              can include "hours" or "days" after the numeric value to get
              those more frequent purges.  (e.g. a value of '12hours'
              would purge everything older than 12 hours.)

       PurgeJobAfter=
              Purge job records older than the time stated, in months.
              If you want to purge on a shorter time period you can
              include "hours" or "days" after the numeric value to get
              those more frequent purges.  (e.g. a value of '12hours'
              would purge everything older than 12 hours.)

       PurgeStepAfter=
              Purge step records older than the time stated, in months.
              If you want to purge on a shorter time period you can
              include "hours" or "days" after the numeric value to get
              those more frequent purges.  (e.g. a value of '12hours'
              would purge everything older than 12 hours.)

       PurgeSuspendAfter=
              Purge job suspend records older than the time stated, in
              months.  If you want to purge on a shorter time period you
              can include "hours" or "days" after the numeric value to get
              those more frequent purges.  (e.g. a value of '12hours'
              would purge everything older than 12 hours.)

       Script=
              Run this script instead of the generic form of archive to
              flat files.

       Steps  Archive Steps.  If not specified and PurgeStepAfter is set,
              all step data removed will be lost permanently.

       Suspend
              Archive Suspend Data.  If not specified and
              PurgeSuspendAfter is set, all suspend data removed will be
              lost permanently.
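       Combining the options above, a hypothetical invocation that
       archives job and step records older than 12 months into a chosen
       directory (the path is illustrative) might look like:

       ```shell
       # Archive job and step data, purging records older than 12 months.
       # The directory must be writable by slurmdbd.
       sacctmgr archive dump Directory=/var/slurm/archive Jobs Steps \
               PurgeJobAfter=12months PurgeStepAfter=12months
       ```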

       archive load
       Load into the database previously archived data.  The archive file
       will not be loaded if the records already exist in the database -
       therefore, trying to load an archive file more than once will
       result in an error.  When this data is again archived and purged
       from the database, if the old archive file is still in the
       directory ArchiveDir, a new archive file will be created (see
       ArchiveDir in the slurmdbd.conf man page), so the old file will not
       be overwritten and these files will have duplicate records.

       Archive files from the current or any prior Slurm release may be
       loaded through archive load.

       File=  File to load into the database.  The specified file must
              exist on the slurmdbd host, which is not necessarily the
              machine running the command.

       Insert=
              SQL to insert directly into the database.  This should be
              used very cautiously since this is writing your SQL into the
              database.
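       For example, loading back an archive file produced by a previous
       archive dump (the file name is hypothetical, and the path must
       exist on the slurmdbd host):

       ```shell
       # Re-insert previously archived records, e.g. to inspect old data.
       sacctmgr archive load File=/var/slurm/archive/tux_job_archive
       ```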

PERFORMANCE
       Executing sacctmgr sends a remote procedure call to slurmdbd.  If
       enough calls from sacctmgr or other Slurm client commands that send
       remote procedure calls to the slurmdbd daemon come in at once, it
       can result in a degradation of performance of the slurmdbd daemon,
       possibly resulting in a denial of service.

       Do not run sacctmgr or other Slurm client commands that send remote
       procedure calls to slurmdbd from loops in shell scripts or other
       programs.  Ensure that programs limit calls to sacctmgr to the
       minimum necessary for the information you are trying to gather.
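       For example, instead of querying users one at a time in a loop, a
       single call with a comma-separated filter issues only one remote
       procedure call (the user names below are hypothetical):

       ```shell
       # Avoid: one slurmdbd RPC per loop iteration.
       for u in adam brian caroline; do
           sacctmgr -n show assoc user=$u format=user,account
       done

       # Prefer: one RPC returning all three users at once.
       sacctmgr -n show assoc user=adam,brian,caroline format=user,account
       ```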

ENVIRONMENT VARIABLES
       Some sacctmgr options may be set via environment variables.  These
       environment variables, along with their corresponding options, are
       listed below.  (Note: Command line options will always override
       these settings.)

       SLURM_CONF The location of the Slurm configuration file.

EXAMPLES
       NOTE: There is an order to set up accounting associations.  You
       must define clusters before you add accounts, and you must add
       accounts before you can add users.

       $ sacctmgr create cluster tux
       $ sacctmgr create account name=science fairshare=50
       $ sacctmgr create account name=chemistry parent=science fairshare=30
       $ sacctmgr create account name=physics parent=science fairshare=20
       $ sacctmgr create user name=adam cluster=tux account=physics fairshare=10
       $ sacctmgr delete user name=adam cluster=tux account=physics
       $ sacctmgr delete account name=physics cluster=tux
       $ sacctmgr modify user where name=adam cluster=tux account=physics set maxjobs=2 maxwall=30:00
       $ sacctmgr add user brian account=chemistry
       $ sacctmgr list associations cluster=tux format=Account,Cluster,User,Fairshare tree withd
       $ sacctmgr list transactions Action="Add Users" Start=11/03-10:30:00 format=Where,Time
       $ sacctmgr dump cluster=tux file=tux_data_file
       $ sacctmgr load tux_data_file

       A user's account cannot be changed directly.  A new association
       needs to be created for the user with the new account.  Then the
       association with the old account can be deleted.
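       A sketch of that sequence, moving the (hypothetical) user adam
       from account physics to account chemistry on cluster tux:

       ```shell
       # 1. Create a new association with the new account.
       sacctmgr add user name=adam cluster=tux account=chemistry

       # 2. Delete the association with the old account.
       sacctmgr delete user name=adam cluster=tux account=physics
       ```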

       When modifying an object, placing the keyword 'set' and the
       optional 'where' correctly is critical.  Below are examples that
       produce correct results.  As a rule of thumb, anything you put in
       front of 'set' will be used as a filter.  If you want to put a
       filter after the keyword 'set' you should use the keyword 'where'.
       The following is wrong:

       $ sacctmgr modify user name=adam set fairshare=10 cluster=tux

       This will produce an error, as the above line reads "modify user
       adam set fairshare=10 and cluster=tux".  Either of the following is
       correct:

       $ sacctmgr modify user name=adam cluster=tux set fairshare=10
       $ sacctmgr modify user name=adam set fairshare=10 where cluster=tux

       When changing the QOS for something, only use the '=' operator when
       you want to explicitly set the QOS to something.  In most cases you
       will want to use the '+=' or '-=' operator to either add to or
       remove from the existing QOS already in place.

       If a user already has a QOS of normal,standby from a parent, or it
       was explicitly set, you should use qos+=expedite to add to the list
       in this fashion.

       If you are looking to add the QOS expedite to only a certain
       account and/or cluster, you can do that by specifying them on the
       sacctmgr line.

       $ sacctmgr modify user name=adam set qos+=expedite

       or

       $ sacctmgr modify user name=adam acct=this cluster=tux set qos+=expedite

       Let's give an example of how to add QOS to user accounts.  List all
       available QOSs in the cluster.

       $ sacctmgr show qos format=name
             Name
       ---------
          normal
        expedite

       List all the associations in the cluster.

       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account        QOS
       -------- ---------- ----------
          zebra       root     normal
          zebra       root     normal
          zebra          g     normal
          zebra         g1     normal

       Add the QOS expedite to account g1 and display the result.  Using
       the += operator, the QOS is added to the QOS already on this
       account.

       $ sacctmgr modify account name=g1 set qos+=expedite
       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account             QOS
       -------- ---------- ---------------
          zebra       root          normal
          zebra       root          normal
          zebra          g          normal
          zebra         g1 expedite,normal

       Now set the QOS expedite as the only QOS for the account g and
       display the result.  With the = operator, expedite becomes the only
       QOS usable by account g.

       $ sacctmgr modify account name=g set qos=expedite
       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account             QOS
       -------- ---------- ---------------
          zebra       root          normal
          zebra       root          normal
          zebra          g        expedite
          zebra         g1 expedite,normal

       If a new account is added under the account g it will inherit the
       QOS expedite and it will not have access to QOS normal.

       $ sacctmgr add account banana parent=g
       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account             QOS
       -------- ---------- ---------------
          zebra       root          normal
          zebra       root          normal
          zebra          g        expedite
          zebra     banana        expedite
          zebra         g1 expedite,normal

       An example of listing trackable resources:

       $ sacctmgr show tres
           Type             Name     ID
       -------- ---------------- ------
            cpu                       1
            mem                       2
         energy                       3
           node                       4
        billing                       5
           gres        gpu:tesla   1001
        license              vcs   1002
             bb             cray   1003


COPYING
       Copyright (C) 2008-2010 Lawrence Livermore National Security.
       Produced at Lawrence Livermore National Laboratory (cf,
       DISCLAIMER).
       Copyright (C) 2010-2022 SchedMD LLC.

       This file is part of Slurm, a resource management program.  For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by
       the Free Software Foundation; either version 2 of the License, or
       (at your option) any later version.

       Slurm is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.


SEE ALSO
       slurm.conf(5), slurmdbd(8)



April 2022                     Slurm Commands                    sacctmgr(1)