sacctmgr(1)                     Slurm Commands                    sacctmgr(1)


NAME
       sacctmgr - Used to view and modify Slurm account information.

SYNOPSIS
       sacctmgr [OPTIONS...] [COMMAND...]

DESCRIPTION
       sacctmgr is used to view or modify Slurm account information.  The
       account information is maintained within a database with the
       interface being provided by slurmdbd (Slurm Database daemon).  This
       database can serve as a central storehouse of user and computer
       information for multiple computers at a single site.  Slurm account
       information is recorded based upon four parameters that form what is
       referred to as an association.  These parameters are user, cluster,
       partition, and account.  user is the login name.  cluster is the
       name of a Slurm managed cluster as specified by the ClusterName
       parameter in the slurm.conf configuration file.  partition is the
       name of a Slurm partition on that cluster.  account is the bank
       account for a job.  The intended mode of operation is to initiate
       the sacctmgr command; add, delete, modify, and/or list association
       records; then commit the changes and exit.

       NOTE: The contents of Slurm's database are maintained in lower case.
       This may result in some sacctmgr output differing from that of other
       Slurm commands.

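A minimal end-to-end session following this mode of operation might look like the sketch below; the cluster, account, and user names (snowflake, science, physics, adam) are illustrative, not defaults:

```shell
# Register a cluster, build a small account hierarchy, and add a user.
sacctmgr add cluster snowflake
sacctmgr add account science Cluster=snowflake \
    Description="science accounts" Organization=science
sacctmgr add account physics Parent=science Cluster=snowflake \
    Description="physics department" Organization=physics
sacctmgr add user adam Account=physics

# Verify the resulting associations.
sacctmgr show association Cluster=snowflake
```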

OPTIONS
       -s, --associations
              Use with show or list to display associations with the
              entity.  This is equivalent to the associations command.

       -h, --help
              Print a help message describing the usage of sacctmgr.  This
              is equivalent to the help command.

       -i, --immediate
              Commit changes immediately without asking for confirmation.

       -n, --noheader
              No header will be added to the beginning of the output.

       -p, --parsable
              Output will be '|' delimited with a '|' at the end.

       -P, --parsable2
              Output will be '|' delimited without a '|' at the end.

       -Q, --quiet
              Print no messages other than error messages.  This is
              equivalent to the quiet command.

       -r, --readonly
              Makes it so the running sacctmgr cannot modify accounting
              information.  The readonly option is for use within
              interactive mode.

       -v, --verbose
              Enable detailed logging.  This is equivalent to the verbose
              command.

       -V, --version
              Display version number.  This is equivalent to the version
              command.

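The parsable options pair well with -n when post-processing output in scripts.  A sketch (the format fields are real format options; the pipeline itself is illustrative):

```shell
# Emit '|'-delimited association records with no header and no
# trailing '|', then pull out the user column.
sacctmgr -nP list association format=Cluster,Account,User | cut -d'|' -f3
```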

COMMANDS
       add <ENTITY> <SPECS>
              Add an entity.  Identical to the create command.

       archive {dump|load} <SPECS>
              Write database information to a flat file or load information
              that has previously been written to a file.

       clear stats
              Clear the server statistics.

       create <ENTITY> <SPECS>
              Add an entity.  Identical to the add command.

       delete <ENTITY> where <SPECS>
              Delete the specified entities.  Identical to the remove
              command.

       dump <ENTITY> [File=FILENAME]
              Dump cluster data to the specified file.  If the filename is
              not specified, the filename clustername.cfg is used by
              default.

       help   Display a description of sacctmgr options and commands.

       list <ENTITY> [<SPECS>]
              Display information about the specified entity.  By default,
              all entries are displayed; you can narrow results by
              specifying SPECS in your query.  Identical to the show
              command.

       load <FILENAME>
              Load cluster data from the specified file.  This is a
              configuration file generated by running the sacctmgr dump
              command.  This command does not load archive data; see the
              sacctmgr archive load option instead.

       modify <ENTITY> where <SPECS> set <SPECS>
              Modify an entity.

       reconfigure
              Reconfigures the SlurmDBD if running with one.

       remove <ENTITY> where <SPECS>
              Delete the specified entities.  Identical to the delete
              command.

       show <ENTITY> [<SPECS>]
              Display information about the specified entity.  By default,
              all entries are displayed; you can narrow results by
              specifying SPECS in your query.  Identical to the list
              command.

       shutdown
              Shut down the server.

       version
              Display the version number of sacctmgr.

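For instance, dump and load can copy a cluster's account hierarchy between databases, and modify adjusts records in place (the cluster and account names are illustrative):

```shell
# Write cluster data to a flat file (defaults to snowflake.cfg if
# File= is omitted), then recreate it from that file.
sacctmgr dump snowflake File=snowflake.cfg
sacctmgr load snowflake.cfg

# Change a limit on an existing account.
sacctmgr modify account where Name=physics set MaxJobs=50
```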
INTERACTIVE COMMANDS
       NOTE: All commands listed below can be used in the interactive mode,
       but NOT on the initial command line.

       exit   Terminate sacctmgr interactive mode.  Identical to the quit
              command.

       quiet  Print no messages other than error messages.

       quit   Terminate the execution of sacctmgr interactive mode.
              Identical to the exit command.

       verbose
              Enable detailed logging.  This includes time-stamps on data
              structures, record counts, etc.  This is an independent
              command with no options meant for use in interactive mode.

       !!     Repeat the last command.

ENTITIES
       account
              A bank account, typically specified at job submit time using
              the --account= option.  These may be arranged in a
              hierarchical fashion; for example, accounts 'chemistry' and
              'physics' may be children of the account 'science'.  The
              hierarchy may have an arbitrary depth.

       association
              The entity used to group information consisting of four
              parameters: account, cluster, partition (optional), and
              user.  Used only with the list or show command.  Add, modify,
              and delete should be done to a user, account or cluster
              entity.  This will in turn update the underlying
              associations.

       cluster
              The ClusterName parameter in the slurm.conf configuration
              file, used to differentiate accounts on different machines.

       configuration
              Used only with the list or show command to report current
              system configuration.

       coordinator
              A special privileged user, usually an account manager, that
              can add users or sub-accounts to the account they are
              coordinator over.  This should be a trusted person since
              they can change limits on account and user associations, as
              well as cancel, requeue or reassign accounts of jobs inside
              their realm.

       event  Events like downed or draining nodes on clusters.

       federation
              A group of clusters that work together to schedule jobs.

       job    Used to modify specific fields of a job: Derived Exit Code,
              the Comment String, or wckey.

       problem
              Use with show or list to display entity problems.

       qos    Quality of Service.

       reservation
              A collection of resources set apart for use by a particular
              account, user or group of users for a given period of time.

       resource
              Software resources for the system.  These are software
              licenses shared among clusters.

       RunawayJobs
              Used only with the list or show command to report current
              jobs that have been orphaned on the local cluster and are
              now runaway.  If there are jobs in this state it will also
              give you an option to "fix" them.  NOTE: You must have an
              AdminLevel of at least Operator to perform this.

       stats  Used with the list or show command to view server
              statistics.  Accepts an optional argument of ave_time or
              total_time to sort on those fields.  By default, sorts on
              the increasing RPC count field.

       transaction
              List of transactions that have occurred during a given time
              period.

       tres   Used with the list or show command to view a list of
              Trackable RESources configured on the system.

       user   The login name.  Usernames are case-insensitive (forced to
              lowercase) unless the PreserveCaseUser option has been set
              in the SlurmDBD configuration file.

       wckeys Workload Characterization Key.  An arbitrary string for
              grouping orthogonal accounts.

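The RunawayJobs entity, for example, is queried like any other (requires an AdminLevel of at least Operator):

```shell
# List orphaned jobs on the local cluster; if any are found, sacctmgr
# offers to fix them interactively.
sacctmgr list runawayjobs
```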
GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES
       NOTE: The group limits (GrpJobs, GrpTRES, etc.) are tested when a
       job is being considered for being allocated resources.  If starting
       a job would cause any of its group limits to be exceeded, that job
       will not be considered for scheduling even if that job might
       preempt other jobs which would release sufficient group resources
       for the pending job to be initiated.

       DefaultQOS=<default qos>
              The default QOS this association and its children should
              have.  This is overridden if set directly on a user.  To
              clear a previously set value use the modify command with a
              new value of -1.

       Fairshare=<fairshare number | parent>
              Number used in conjunction with other accounts to determine
              job priority.  Can also be the string parent; when used on a
              user this means that the parent association is used for
              fairshare.  If Fairshare=parent is set on an account, that
              account's children will be effectively reparented for
              fairshare calculations to the first parent of their parent
              that is not Fairshare=parent.  Limits remain the same; only
              the fairshare value is affected.  To clear a previously set
              value use the modify command with a new value of -1.

       GrpTRESMins=<TRES=max TRES minutes,...>
              The total number of TRES minutes that can possibly be used
              by past, present and future jobs running from this
              association and its children.  To clear a previously set
              value use the modify command with a new value of -1 for each
              TRES id.

              NOTE: This limit is not enforced if set on the root
              association of a cluster.  So even though it may appear in
              sacctmgr output, it will not be enforced.

              ALSO NOTE: This limit only applies when using the Priority
              Multifactor plugin.  The time is decayed using the value of
              PriorityDecayHalfLife or PriorityUsageResetPeriod as set in
              the slurm.conf.  When this limit is reached, all associated
              running jobs will be killed and all future jobs submitted
              with associations in the group will be delayed until they
              are able to run inside the limit.

       GrpTRESRunMins=<TRES=max TRES run minutes,...>
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limit of running jobs
              and consumes it.  If the limit is reached, no new jobs are
              started until other jobs finish to free up time.

       GrpTRES=<TRES=max TRES,...>
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.  To clear a previously set
              value use the modify command with a new value of -1 for each
              TRES id.

              NOTE: This limit only applies fully when using the Select
              Consumable Resource plugin.

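Setting and clearing a group TRES limit follows the pattern described above; in this sketch the account name and limit values are illustrative:

```shell
# Cap the physics account and its children at 100 CPUs and 500 GB of
# memory allocated across all running jobs.
sacctmgr modify account where Name=physics set GrpTRES=cpu=100,mem=500G

# Clear both limits again with a value of -1 per TRES id.
sacctmgr modify account where Name=physics set GrpTRES=cpu=-1,mem=-1
```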
       GrpJobs=<max jobs>
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.  To clear a previously set value use the modify
              command with a new value of -1.

       GrpJobsAccrue=<max jobs>
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.  To clear a previously set
              value use the modify command with a new value of -1.

       GrpSubmitJobs=<max jobs>
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.  To
              clear a previously set value use the modify command with a
              new value of -1.

              NOTE: This setting shows up in the sacctmgr output as
              GrpSubmit.

       GrpWall=<max wall>
              Maximum wall clock time running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.  To
              clear a previously set value use the modify command with a
              new value of -1.

              NOTE: This limit is not enforced if set on the root
              association of a cluster.  So even though it may appear in
              sacctmgr output, it will not be enforced.

              ALSO NOTE: This limit only applies when using the Priority
              Multifactor plugin.  The time is decayed using the value of
              PriorityDecayHalfLife or PriorityUsageResetPeriod as set in
              the slurm.conf.  When this limit is reached, all associated
              running jobs will be killed and all future jobs submitted
              with associations in the group will be delayed until they
              are able to run inside the limit.

       MaxTRESMinsPerJob=<max TRES minutes>
              Maximum number of TRES minutes each job is able to use in
              this association.  This is overridden if set directly on a
              user.  Default is the cluster's limit.  To clear a
              previously set value use the modify command with a new value
              of -1 for each TRES id.

              NOTE: This setting shows up in the sacctmgr output as
              MaxTRESMins.

       MaxTRESPerJob=<max TRES>
              Maximum number of TRES each job is able to use in this
              association.  This is overridden if set directly on a user.
              Default is the cluster's limit.  To clear a previously set
              value use the modify command with a new value of -1 for each
              TRES id.

              NOTE: This setting shows up in the sacctmgr output as
              MaxTRES.

              NOTE: This limit only applies fully when using the cons_res
              or cons_tres select type plugins.

       MaxJobs=<max jobs>
              Maximum number of jobs each user is allowed to run at one
              time in this association.  This is overridden if set
              directly on a user.  Default is the cluster's limit.  To
              clear a previously set value use the modify command with a
              new value of -1.

       MaxJobsAccrue=<max jobs>
              Maximum number of pending jobs able to accrue age priority
              at any given time for the given association.  This is
              overridden if set directly on a user.  Default is the
              cluster's limit.  To clear a previously set value use the
              modify command with a new value of -1.

       MaxSubmitJobs=<max jobs>
              Maximum number of jobs this association can have in a
              pending or running state at any time.  Default is the
              cluster's limit.  To clear a previously set value use the
              modify command with a new value of -1.

              NOTE: This setting shows up in the sacctmgr output as
              MaxSubmit.

       MaxWallDurationPerJob=<max wall>
              Maximum wall clock time each job is able to use in this
              association.  This is overridden if set directly on a user.
              Default is the cluster's limit.  <max wall> format is <min>
              or <min>:<sec> or <hr>:<min>:<sec> or
              <days>-<hr>:<min>:<sec> or <days>-<hr>.  The value is
              recorded in minutes with rounding as needed.  To clear a
              previously set value use the modify command with a new value
              of -1.

              NOTE: Changing this value will have no effect on any running
              or pending job.

              NOTE: This setting shows up in the sacctmgr output as
              MaxWall.

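A brief sketch of the accepted wall time formats (the account name is illustrative):

```shell
# Limit each job to 2 days and 12 hours of wall time; the value is
# recorded in minutes (3600).
sacctmgr modify account where Name=chemistry set \
    MaxWallDurationPerJob=2-12:00:00

# The same limit expressed directly in minutes.
sacctmgr modify account where Name=chemistry set \
    MaxWallDurationPerJob=3600
```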
       Priority
              What priority will be added to a job's priority when using
              this association.  This is overridden if set directly on a
              user.  Default is the cluster's limit.  To clear a
              previously set value use the modify command with a new value
              of -1.

       QosLevel<operator><comma separated list of qos names>
              Specify the default Quality of Service's that jobs are able
              to run at for this association.  To get a list of valid
              QOS's use 'sacctmgr list qos'.  This value will override its
              parent's value and push down to its children as the new
              default.  Setting a QosLevel to '' (two single quotes with
              nothing between them) restores its default setting.  You can
              also use the operators += and -= to add or remove certain
              QOS's from a QOS list.

              Valid <operator> values include:

              =      Set QosLevel to the specified value.  Note: the QOS
                     that can be used at a given account in the hierarchy
                     are inherited by the children of that account.  By
                     assigning QOS with the = sign, only the assigned QOS
                     can be used by the account and its children.

              +=     Add the specified <qos> value to the current
                     QosLevel.  The account will have access to this QOS
                     and the others previously assigned to it.

              -=     Remove the specified <qos> value from the current
                     QosLevel.

       See the EXAMPLES section below.

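As a quick sketch of the three operators (account and QOS names are illustrative):

```shell
# Replace the QOS list outright, then add and remove individual QOS.
sacctmgr modify account where Name=physics set QosLevel=normal,high
sacctmgr modify account where Name=physics set QosLevel+=debug
sacctmgr modify account where Name=physics set QosLevel-=high

# Restore the inherited default.
sacctmgr modify account where Name=physics set QosLevel=''
```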
SPECIFICATIONS FOR ACCOUNTS
       Cluster=<cluster>
              Specific cluster to add the account to.  Default is all
              clusters in the system.

       Description=<description>
              An arbitrary string describing an account.

       Name=<name>
              The name of a bank account.  Note that the name must be
              unique and cannot represent different bank accounts at
              different points in the account hierarchy.

       Organization=<org>
              Organization to which the account belongs.

       Parent=<parent>
              Parent account of this account.  Default is the root
              account, a top level account.

       RawUsage=<value>
              This allows an administrator to reset the raw usage accrued
              to an account.  The only value currently supported is 0
              (zero).  This is a settable specification only - it cannot
              be used as a filter to list accounts.

       WithAssoc
              Display all associations for this account.

       WithCoord
              Display all coordinators for this account.

       WithDeleted
              Display information with previously deleted data.

       NOTE: If using the WithAssoc option you can also query against
       association specific information to view only certain associations
       this account may have.  These extra options can be found in the
       SPECIFICATIONS FOR ASSOCIATIONS section.  You can also use the
       general specifications list above in the GENERAL SPECIFICATIONS FOR
       ASSOCIATION BASED ENTITIES section.

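Because RawUsage is settable but not filterable, it only appears on the set side of a modify.  A sketch (account name illustrative):

```shell
# Zero out the usage accrued to an account; 0 is the only supported
# value.
sacctmgr modify account where Name=physics set RawUsage=0
```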
LIST/SHOW ACCOUNT FORMAT OPTIONS
       Account
              The name of a bank account.

       Description
              An arbitrary string describing an account.

       Organization
              Organization to which the account belongs.

       Coordinators
              List of users that are coordinators of the account.  (Only
              filled in when using the WithCoord option.)

       NOTE: If using the WithAssoc option you can also view information
       about the various associations the account may have on all the
       clusters in the system.  The association information can be
       filtered.  Note that all the accounts in the database will always
       be shown, as the filter only takes effect over the association
       data.  The Association format fields are described in the LIST/SHOW
       ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR ASSOCIATIONS
       Clusters=<comma separated list of cluster names>
              List the associations of the cluster(s).

       Accounts=<comma separated list of account names>
              List the associations of the account(s).

       Users=<comma separated list of user names>
              List the associations of the user(s).

       Partition=<comma separated list of partition names>
              List the associations of the partition(s).

       NOTE: You can also use the general specifications list above in the
       GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

       Other options unique for listing associations:

       OnlyDefaults
              Display only associations that are default associations.

       Tree   Display account names in a hierarchical fashion.

       WithDeleted
              Display information with previously deleted data.

       WithSubAccounts
              Display information with sub-accounts.  Only really valuable
              when used with the account= option.  This will display all
              the sub-account associations along with the accounts listed
              in the option.

       WOLimits
              Display information without limit information.  This is for
              a smaller default format of
              "Cluster,Account,User,Partition".

       WOPInfo
              Display information without parent information (i.e. parent
              id and parent account name).  This option also implicitly
              sets the WOPLimits option.

       WOPLimits
              Display information without hierarchical parent limits (i.e.
              it will only display limits where they are set instead of
              propagating them from the parent).

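Combining these listing options gives a compact view of one cluster's hierarchy, for example (cluster name illustrative):

```shell
# Show the association tree for one cluster without limit columns.
sacctmgr list association Cluster=snowflake Tree WOLimits
```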
LIST/SHOW ASSOCIATION FORMAT OPTIONS
       Account
              The name of a bank account in the association.

       Cluster
              The name of a cluster in the association.

       DefaultQOS
              The QOS the association will use by default if it has access
              to it in the QOS list mentioned below.

       Fairshare
              Number used in conjunction with other accounts to determine
              job priority.  Can also be the string parent; when used on a
              user this means that the parent association is used for
              fairshare.  If Fairshare=parent is set on an account, that
              account's children will be effectively reparented for
              fairshare calculations to the first parent of their parent
              that is not Fairshare=parent.  Limits remain the same; only
              the fairshare value is affected.

       GrpTRESMins
              The total number of TRES minutes that can possibly be used
              by past, present and future jobs running from this
              association and its children.

       GrpTRESRunMins
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limit of running jobs
              and consumes it.  If the limit is reached, no new jobs are
              started until other jobs finish to free up time.

       GrpTRES
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpJobs
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.

       GrpJobsAccrue
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.

       GrpSubmitJobs
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.

              NOTE: This setting shows up in the sacctmgr output as
              GrpSubmit.

       GrpWall
              Maximum wall clock time running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       ID     The id of the association.

       LFT    Associations are kept in a hierarchy: this is the leftmost
              spot in the hierarchy.  When used with the RGT variable, all
              associations with a LFT inside this LFT and before the RGT
              are children of this association.

       MaxTRESPerJob
              Maximum number of TRES each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as
              MaxTRES.

       MaxTRESMinsPerJob
              Maximum number of TRES minutes each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as
              MaxTRESMins.

       MaxTRESPerNode
              Maximum number of TRES each node in a job allocation can
              use.

       MaxJobs
              Maximum number of jobs each user is allowed to run at one
              time.

       MaxJobsAccrue
              Maximum number of pending jobs able to accrue age priority
              at any given time.  This limit only applies to the job's QOS
              and not the partition's QOS.

       MaxSubmitJobs
              Maximum number of jobs in a pending or running state at any
              time.

              NOTE: This setting shows up in the sacctmgr output as
              MaxSubmit.

       MaxWallDurationPerJob
              Maximum wall clock time each job is able to use.

              NOTE: This setting shows up in the sacctmgr output as
              MaxWall.

       Qos    Valid QOS' for this association.

       QosRaw The ids of the valid QOS' for this association.

       ParentID
              The association id of the parent of this association.

       ParentName
              The account name of the parent of this association.

       Partition
              The name of a partition in the association.

       Priority
              What priority will be added to a job's priority when using
              this association.

       WithRawQOSLevel
              Display QosLevel in an unevaluated raw format, consisting of
              a comma separated list of QOS names prepended with ''
              (nothing), '+' or '-' for the association.  QOS names
              without +/- prepended were assigned (i.e., sacctmgr modify
              ... set QosLevel=qos_name) for the entity listed or on one
              of its parents in the hierarchy.  QOS names with +/-
              prepended indicate the QOS was added/filtered (i.e.,
              sacctmgr modify ... set QosLevel=[+-]qos_name) for the
              entity listed or on one of its parents in the hierarchy.
              Including WOPLimits will show exactly where each QOS was
              assigned, added or filtered in the hierarchy.

       RGT    Associations are kept in a hierarchy: this is the rightmost
              spot in the hierarchy.  When used with the LFT variable, all
              associations with a LFT inside this RGT and after the LFT
              are children of this association.

       User   The name of a user in the association.

SPECIFICATIONS FOR CLUSTERS
       Classification=<classification>
              Type of machine.  Current classifications are capability,
              capacity and capapacity.

       Features=<comma separated list of feature names>
              Features that are specific to the cluster.  Federated jobs
              can be directed to clusters that contain the job's requested
              features.

       Federation=<federation>
              The federation that this cluster should be a member of.  A
              cluster can only be a member of one federation at a time.

       FedState=<state>
              The state of the cluster in the federation.
              Valid states are:

              ACTIVE Cluster will actively accept and schedule federated
                     jobs.

              INACTIVE
                     Cluster will not schedule or accept any jobs.

              DRAIN  Cluster will not accept any new jobs and will let
                     existing federated jobs complete.

              DRAIN+REMOVE
                     Cluster will not accept any new jobs and will remove
                     itself from the federation once all federated jobs
                     have completed.  When removed from the federation,
                     the cluster will accept jobs as a non-federated
                     cluster.

       Name=<name>
              The name of a cluster.  This should be equal to the
              ClusterName parameter in the slurm.conf configuration file
              for some Slurm-managed cluster.

       RPC=<rpc list>
              Comma separated list of numeric RPC values.

       WithFed
              Appends federation related columns to the default format
              options (e.g. Federation,ID,Features,FedState).

       WOLimits
              Display information without limit information.  This is for
              a smaller default format of
              Cluster,ControlHost,ControlPort,RPC.

       NOTE: You can also use the general specifications list above in the
       GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

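For example, a cluster can be drained out of its federation with a FedState change; the cluster name below is illustrative:

```shell
# Stop accepting new federated jobs but let existing ones finish.
sacctmgr modify cluster snowflake set FedState=DRAIN
```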
LIST/SHOW CLUSTER FORMAT OPTIONS
       Classification
              Type of machine, i.e. capability, capacity or capapacity.

       Cluster
              The name of the cluster.

       ControlHost
              When a slurmctld registers with the database, the IP address
              of the controller is placed here.

       ControlPort
              When a slurmctld registers with the database, the port the
              controller is listening on is placed here.

       Features
              The list of features on the cluster (if any).

       Federation
              The name of the federation this cluster is a member of (if
              any).

       FedState
              The state of the cluster in the federation (if a member of
              one).

       FedStateRaw
              Numeric value of the name of the FedState.

       Flags  Attributes possessed by the cluster.  Current flags include
              Cray, External and MultipleSlurmd.

              External clusters are registration-only clusters.  A
              slurmctld can designate an external slurmdbd with the
              AccountingStorageExternalHost slurm.conf option.  This
              allows a slurmctld to register to an external slurmdbd so
              that clusters attached to the external slurmdbd can
              communicate with the external cluster with Slurm commands.

       ID     The ID assigned to the cluster when a member of a
              federation.  This ID uniquely identifies the cluster and its
              jobs in the federation.

       NodeCount
              The current count of nodes associated with the cluster.

       NodeNames
              The current Nodes associated with the cluster.

       PluginIDSelect
              The numeric value of the select plugin the cluster is using.

       RPC    When a slurmctld registers with the database, the RPC
              version the controller is running is placed here.

       TRES   Trackable RESources (Billing, BB (burst buffer), CPU,
              Energy, GRES, License, Memory, and Node) this cluster is
              accounting for.

       NOTE: You can also view the information about the root association
       for the cluster.  The Association format fields are described in
       the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR COORDINATOR
       Account=<comma separated list of account names>
              Account name(s) to add this user as a coordinator of.

       Names=<comma separated list of user names>
              Names of coordinators.

       NOTE: To list coordinators, use the WithCoord option with list
       account or list user.

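For example (the account and user names are illustrative):

```shell
# Make adam a coordinator of the physics account, then confirm.
sacctmgr add coordinator Account=physics Names=adam
sacctmgr list account Names=physics WithCoord
```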
SPECIFICATIONS FOR EVENTS
       All_Clusters
              Shortcut to get information on all clusters.

       All_Time
              Shortcut to get information for all time.

       Clusters=<comma separated list of cluster names>
              List the events of the cluster(s).  Default is the cluster
              where the command was run.

       End=<OPT>
              Period ending of events.  Default is now.

              Valid time formats are...

              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

       Event=<OPT>
              Specific events to look for.  Valid options are Cluster or
              Node; the default is both.

       MaxTRES=<OPT>
              Max number of TRES affected by an event.

       MinTRES=<OPT>
              Min number of TRES affected by an event.

       Nodes=<comma separated list of node names>
              Node names affected by an event.

       Reason=<comma separated list of reasons>
              Reason an event happened.

       Start=<OPT>
              Period start of events.  Default is 00:00:00 of the previous
              day, unless states are given with the States= specification.
              If that is the case, the default behavior is to return
              events currently in the states specified.

              Valid time formats are...

              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]
              now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]

       States=<comma separated list of states>
              State of a node in a node event.  If this is set, the event
              type is set automatically to Node.

       User=<comma separated list of users>
              Query against users who set the event.  If this is set, the
              event type is set automatically to Node, since only the
              slurm user can perform a cluster event.

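Putting a few of these specifications together (the cluster name and dates are illustrative):

```shell
# Node events on one cluster during the first week of January 2024.
sacctmgr list event Clusters=snowflake Event=Node \
    Start=2024-01-01 End=2024-01-08
```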
LIST/SHOW EVENT FORMAT OPTIONS
       Cluster
              The name of the cluster the event happened on.

       ClusterNodes
              The hostlist of nodes on a cluster in a cluster event.

       Duration
              The length of time the event lasted.

       End    Period when the event ended.

       Event  Name of the event.

       EventRaw
              Numeric value of the name of the event.

       NodeName
              The node affected by the event.  In a cluster event, this is
              blank.

       Reason The reason an event happened.

       Start  Period when the event started.

       State  On a node event this is the formatted state of the node
              during the event.

       StateRaw
              On a node event this is the numeric value of the state of
              the node during the event.

       TRES   Number of TRES involved with the event.

       User   On a node event this is the user who caused the event to
              happen.

1045 Clusters[+|-]=<comma separated list of cluster names>
1046 List of clusters to add/remove to a federation. A blank value
1047 (e.g. clusters=) will remove all clusters from the federation.
1048 NOTE: a cluster can only be a member of one federation.
1049
1050
1051 Name=<name>
1052 The name of the federation.
1053
1054
1055 Tree Display federations in a hierarchical fashion.
1056
1057
1058 LIST/SHOW FEDERATION FORMAT OPTIONS
1059 Features
1060 The list of features on the cluster.
1061
1062
1063 Federation
1064 The name of the federation.
1065
1066
1067 Cluster
1068 Name of the cluster that is a member of the federation.
1069
1070
1071 FedState
1072 The state of the cluster in the federation.
1073
1074
1075 FedStateRaw
1076 Numeric value of the name of the FedState.
1077
1078
1079 Index The index of the cluster in the federation.
1080
1081
1082
1083 SPECIFICATIONS FOR JOB
1084 Comment=<comment>
1085 The job's comment string when the AccountingStoreFlags parameter
1086 in the slurm.conf file contains 'job_comment'. The user can
1087 only modify the comment string of their own job.
1088
1089
1090 Cluster=<cluster_list>
1091 List of clusters to alter jobs on, defaults to local cluster.
1092
1093
1094 DerivedExitCode=<derived_exit_code>
1095 The derived exit code can be modified after a job completes
1096 based on the user's judgment of whether the job succeeded or
1097 failed. The user can only modify the derived exit code of their
1098 own job.
1099
1100
1101 EndTime
1102 Jobs must end before this time to be modified. The output
1103 format is YYYY-MM-DDTHH:MM:SS, unless changed through the
1104 SLURM_TIME_FORMAT environment variable.
1105
1106
1107 JobID=<jobid_list>
1108 The id of the job to change. Not needed if altering multiple
1109 jobs using the WCKey specification.
1110
1111
1112 NewWCKey=<newwckey>
1113 Used to rename a wckey on job(s) in the accounting database.
1114
1115
1116 StartTime
1117 Jobs must start at or after this time to be modified. Uses the
1118 same format as EndTime.
1119
1120
1121 User=<user_list>
1122 Used to specify the users whose jobs are to be altered.
1123
1124
1125 WCKey=<wckey_list>
1126 Used to specify the wckeys to alter.
1127
1128
1129 The DerivedExitCode, Comment and WCKey fields are the only
1130 fields of a job record in the database that can be modified af‐
1131 ter job completion.
1132
1133
1134 LIST/SHOW JOB FORMAT OPTIONS
1135 The sacct command is the exclusive command to display job records from
1136 the Slurm database.
1137
1138
1139 SPECIFICATIONS FOR QOS
1140 NOTE: The group limits (GrpJobs, GrpNodes, etc.) are tested when a job
1141 is being considered for being allocated resources. If starting a job
1142 would cause any of its group limits to be exceeded, that job will not
1143 be considered for scheduling even if that job might preempt other jobs
1144 which would release sufficient group resources for the pending job to
1145 be initiated.
1146
1147
1148 Flags Used by the slurmctld to override or enforce certain character‐
1149 istics.
1150 Valid options are
1151
1152 DenyOnLimit
1153 If set, jobs using this QOS will be rejected at submis‐
1154 sion time if they do not conform to the QOS 'Max' limits
1155 as stand-alone jobs. Jobs that go over these limits when
1156 other jobs are considered, but conform to the limits when
1157 considered individually, will not be rejected. Instead
1158 they will pend until resources are available. Group lim‐
1159 its (e.g. GrpTRES) will also be treated like 'Max' limits
1160 (e.g. MaxTRESPerNode) and jobs will be denied if they
1161 would violate the limit as stand-alone jobs. This cur‐
1162 rently only applies to QOS and Association limits.
1163
1164 EnforceUsageThreshold
1165 If set, and the QOS also has a UsageThreshold, any jobs
1166 submitted with this QOS that fall below the UsageThresh‐
1167 old will be held until their Fairshare Usage goes above
1168 the Threshold.
1169
1170 NoDecay
1171 If set, this QOS will not have its GrpTRESMins, GrpWall
1172 and UsageRaw decayed by the slurm.conf PriorityDecay‐
1173 HalfLife or PriorityUsageResetPeriod settings. This al‐
1174 lows a QOS to provide aggregate limits that, once con‐
1175 sumed, will not be replenished automatically. Such a QOS
1176 will act as a time-limited quota of resources for an as‐
1177 sociation that has access to it. Account/user usage will
1178 still be decayed for associations using the QOS. The QOS
1179 GrpTRESMins and GrpWall limits can be increased or the
1180 QOS RawUsage value reset to 0 (zero) to again allow jobs
1181 submitted with this QOS to be queued (if DenyOnLimit is
1182 set) or run (pending with QOSGrp{TRES}MinutesLimit or
1183 QOSGrpWallLimit reasons, where {TRES} is some type of
1184 trackable resource).
1185
1186 NoReserve
1187 If this flag is set and backfill scheduling is used, jobs
1188 using this QOS will not reserve resources in the backfill
1189 schedule's map of resources allocated through time. This
1190 flag is intended for use with a QOS that may be preempted
1191 by jobs associated with all other QOS (e.g. use with a
1192 "standby" QOS). If this flag is used with a QOS which
1193 cannot be preempted by all other QOS, it could result in
1194 starvation of larger jobs.
1195
1196 PartitionMaxNodes
1197 If set, jobs using this QOS will be able to override the
1198 requested partition's MaxNodes limit.
1199
1200 PartitionMinNodes
1201 If set, jobs using this QOS will be able to override the
1202 requested partition's MinNodes limit.
1203
1204 OverPartQOS
1205 If set, jobs using this QOS will be able to override the
1206 limits of the requested partition's QOS.
1207
1208 PartitionTimeLimit
1209 If set, jobs using this QOS will be able to override the
1210 requested partition's TimeLimit.
1211
1212 RequiresReservation
1213 If set, jobs using this QOS must designate a reservation
1214 when submitting a job. This option can be useful in re‐
1215 stricting usage of a QOS that may have greater preemptive
1216 capability or additional resources to be allowed only
1217 within a reservation.
1218
1219 UsageFactorSafe
1220 If set, and AccountingStorageEnforce includes Safe, jobs
1221 will only be able to run if the job can run to completion
1222 with the UsageFactor applied.
1223
1224
1225 GraceTime
1226 Preemption grace time, in seconds, to be extended to a job which
1227 has been selected for preemption.
1228
1229
1230 GrpTRESMins
1231 The total number of TRES minutes that can possibly be used by
1232 past, present and future jobs running from this QOS.
1233
1234
1235 GrpTRESRunMins
1236 Used to limit the combined total number of TRES minutes used by
1237 all jobs running with this QOS. This takes into consideration
1238 the time limits of running jobs and consumes them; if the limit
1239 is reached, no new jobs are started until other jobs finish and
1240 free up time.
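The way a GrpTRESRunMins-style limit "consumes" the remaining time limits of running jobs can be sketched as follows. This is a simplified illustrative model, not Slurm source code, and the function names are invented:

```python
# Illustrative sketch (not Slurm source): how a GrpTRESRunMins-style
# limit consumes the remaining time limits of running jobs.
def tres_run_mins(running_jobs):
    """running_jobs: list of (tres_count, remaining_minutes) tuples."""
    return sum(tres * remaining for tres, remaining in running_jobs)

def can_start(new_job, running_jobs, limit):
    """A new job may start only if the TRES-minutes it could still
    consume fit under the limit together with the running jobs."""
    tres, time_limit = new_job
    return tres_run_mins(running_jobs) + tres * time_limit <= limit
```

For example, with a 4-CPU job having 30 minutes left and a 2-CPU job having 60 minutes left, 240 TRES-minutes are already committed; an 8-CPU, 10-minute job fits under a limit of 320, but an 8-CPU, 20-minute job would not.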
1241
1242
1243 GrpTRES
1244 Maximum number of TRES running jobs are able to be allocated in
1245 aggregate for this QOS.
1246
1247
1248 GrpJobs
1249 Maximum number of running jobs in aggregate for this QOS.
1250
1251
1252 GrpJobsAccrue
1253 Maximum number of pending jobs in aggregate able to accrue age
1254 priority for this QOS. This limit only applies to the job's QOS
1255 and not the partition's QOS.
1256
1257
1258 GrpSubmitJobs
1259 Maximum number of jobs which can be in a pending or running
1260 state at any time in aggregate for this QOS.
1261
1262 NOTE: This setting shows up in the sacctmgr output as GrpSubmit.
1263
1264
1265 GrpWall
1266 Maximum wall clock time running jobs are able to be allocated in
1267 aggregate for this QOS. If this limit is reached, submission
1268 requests will be denied and the running jobs will be killed.
1269
1270 ID The id of the QOS.
1271
1272
1273 MaxJobsAccruePerAccount
1274 Maximum number of pending jobs an account (or subacct) can have
1275 accruing age priority at any given time. This limit only ap‐
1276 plies to the job's QOS and not the partition's QOS.
1277
1278
1279 MaxJobsAccruePerUser
1280 Maximum number of pending jobs a user can have accruing age pri‐
1281 ority at any given time. This limit only applies to the job's
1282 QOS and not the partition's QOS.
1283
1284
1285 MaxJobsPerAccount
1286 Maximum number of jobs each account is allowed to run at one
1287 time.
1288
1289
1290 MaxJobsPerUser
1291 Maximum number of jobs each user is allowed to run at one time.
1292
1293
1294 MaxSubmitJobsPerAccount
1295 Maximum number of jobs in a pending or running state at any
1296 time per account.
1297
1298
1299 MaxSubmitJobsPerUser
1300 Maximum number of jobs in a pending or running state at any
1301 time per user.
1302
1303
1304 MaxTRESMinsPerJob
1305 Maximum number of TRES minutes each job is able to use.
1306
1307 NOTE: This setting shows up in the sacctmgr output as Max‐
1308 TRESMins.
1309
1310
1311 MaxTRESPerAccount
1312 Maximum number of TRES each account is able to use.
1313
1314
1315 MaxTRESPerJob
1316 Maximum number of TRES each job is able to use.
1317
1318 NOTE: This setting shows up in the sacctmgr output as MaxTRES.
1319
1320
1321 MaxTRESPerNode
1322 Maximum number of TRES each node in a job allocation can use.
1323
1324
1325 MaxTRESPerUser
1326 Maximum number of TRES each user is able to use.
1327
1328
1329 MaxWallDurationPerJob
1330 Maximum wall clock time each job is able to use.
1331
1332 NOTE: This setting shows up in the sacctmgr output as MaxWall.
1333
1334
1335 MinPrioThreshold
1336 Minimum priority required to reserve resources when scheduling.
1337
1338
1339 MinTRESPerJob
1340 Minimum number of TRES each job running under this QOS must re‐
1341 quest. Otherwise the job will pend until modified.
1342
1343 NOTE: This setting shows up in the sacctmgr output as MinTRES.
1344
1345
1346 Name Name of the QOS.
1347
1348
1349 Preempt
1350 Other QOS' this QOS can preempt.
1351
1352 NOTE: The Priority of a QOS is NOT related to QOS preemption,
1353 only Preempt is used to define which QOS can preempt others.
1354
1355
1356 PreemptExemptTime
1357 Specifies a minimum run time for jobs of this QOS before they
1358 are considered for preemption. This QOS option takes precedence
1359 over the global PreemptExemptTime. Setting to -1 disables the
1360 option, allowing another QOS or the global option to take ef‐
1361 fect. Setting to 0 indicates no minimum run time and supersedes
1362 the lower priority QOS (see OverPartQOS) and/or the global op‐
1363 tion in slurm.conf.
1364
1365
1366 PreemptMode
1367 Mechanism used to preempt jobs or enable gang scheduling for
1368 this QOS when the cluster PreemptType is set to preempt/qos.
1369 This QOS-specific PreemptMode will override the cluster-wide
1370 PreemptMode for this QOS. Unsetting the QOS specific Preempt‐
1371 Mode, by specifying "OFF", "" or "Cluster", makes it use the de‐
1372 fault cluster-wide PreemptMode.
1373 See the description of the cluster-wide PreemptMode parameter
1374 for further details of the available modes.
1375
1376
1377 Priority
1378 What priority will be added to a job's priority when using this
1379 QOS.
1380
1381 NOTE: The Priority of a QOS is NOT related to QOS preemption,
1382 see Preempt instead.
1383
1384
1385 RawUsage=<value>
1386 This allows an administrator to reset the raw usage accrued to a
1387 QOS. The only value currently supported is 0 (zero). This is a
1388 settable specification only - it cannot be used as a filter to
1389 list accounts.
1390
1391
1392 UsageFactor
1393 Usage factor when running with this QOS. See below for more de‐
1394 tails.
1395
1396
1397 LimitFactor
1398 Factor to scale TRES count limits when running with this QOS.
1399 See below for more details.
1400
1401
1402 UsageThreshold
1403 A float representing the lowest fairshare of an association
1404 that is allowed to run a job. If an association falls below
1405 this threshold and has pending jobs or submits new jobs, those
1406 jobs will be held until the usage goes back above the threshold.
1407 Use sshare to see current shares on the system.
1408
1409
1410 WithDeleted
1411 Display information with previously deleted data.
1412
1413
1414
1415 LIST/SHOW QOS FORMAT OPTIONS
1416 Description
1417 An arbitrary string describing a QOS.
1418
1419
1420 GraceTime
1421 Preemption grace time to be extended to a job which has been se‐
1422 lected for preemption in the format of hh:mm:ss. The default
1423 value is zero, no preemption grace time is allowed on this par‐
1424 tition. NOTE: This value is only meaningful for QOS Preempt‐
1425 Mode=CANCEL.
1426
1427
1428 GrpTRESMins
1429 The total number of TRES minutes that can possibly be used by
1430 past, present and future jobs running from this QOS. To clear a
1431 previously set value use the modify command with a new value of
1432 -1 for each TRES id. NOTE: This limit only applies when using
1433 the Priority Multifactor plugin. The time is decayed using the
1434 value of PriorityDecayHalfLife or PriorityUsageResetPeriod as
1435 set in the slurm.conf. When this limit is reached, all running
1436 jobs associated with this QOS will be killed and all future jobs
1437 submitted with this QOS will be delayed until they are able to
1438 run inside the limit.
1439
1440
1441 GrpTRES
1442 Maximum number of TRES running jobs are able to be allocated in
1443 aggregate for this QOS. To clear a previously set value use the
1444 modify command with a new value of -1 for each TRES id.
1445
1446
1447 GrpJobs
1448 Maximum number of running jobs in aggregate for this QOS. To
1449 clear a previously set value use the modify command with a new
1450 value of -1.
1451
1452
1453 GrpJobsAccrue
1454 Maximum number of pending jobs in aggregate able to accrue age
1455 priority for this QOS. This limit only applies to the job's QOS
1456 and not the partition's QOS. To clear a previously set value
1457 use the modify command with a new value of -1.
1458
1459
1460 GrpSubmitJobs
1461 Maximum number of jobs which can be in a pending or running
1462 state at any time in aggregate for this QOS. To clear a previ‐
1463 ously set value use the modify command with a new value of -1.
1464
1465 NOTE: This setting shows up in the sacctmgr output as GrpSubmit.
1466
1467
1468 GrpWall
1469 Maximum wall clock time running jobs are able to be allocated in
1470 aggregate for this QOS. To clear a previously set value use the
1471 modify command with a new value of -1. NOTE: This limit only
1472 applies when using the Priority Multifactor plugin. The time is
1473 decayed using the value of PriorityDecayHalfLife or Priori‐
1474 tyUsageResetPeriod as set in the slurm.conf. When this limit is
1475 reached, all running jobs associated with this QOS will be
1476 killed and all future jobs submitted with this QOS will be
1477 delayed until they are able to run inside the limit.
1478
1479
1480 MaxTRESMinsPerJob
1481 Maximum number of TRES minutes each job is able to use. To
1482 clear a previously set value use the modify command with a new
1483 value of -1 for each TRES id.
1484
1485 NOTE: This setting shows up in the sacctmgr output as Max‐
1486 TRESMins.
1487
1488
1489 MaxTRESPerAccount
1490 Maximum number of TRES each account is able to use. To clear a
1491 previously set value use the modify command with a new value of
1492 -1 for each TRES id.
1493
1494
1495 MaxTRESPerJob
1496 Maximum number of TRES each job is able to use. To clear a pre‐
1497 viously set value use the modify command with a new value of -1
1498 for each TRES id.
1499
1500 NOTE: This setting shows up in the sacctmgr output as MaxTRES.
1501
1502
1503 MaxTRESPerNode
1504 Maximum number of TRES each node in a job allocation can use.
1505 To clear a previously set value use the modify command with a
1506 new value of -1 for each TRES id.
1507
1508
1509 MaxTRESPerUser
1510 Maximum number of TRES each user is able to use. To clear a
1511 previously set value use the modify command with a new value of
1512 -1 for each TRES id.
1513
1514
1515 MaxJobsPerAccount
1516 Maximum number of jobs each account is allowed to run at one
1517 time. To clear a previously set value use the modify command
1518 with a new value of -1.
1519
1520
1521 MaxJobsPerUser
1522 Maximum number of jobs each user is allowed to run at one time.
1523 To clear a previously set value use the modify command with a
1524 new value of -1.
1525
1526
1527 MaxSubmitJobsPerAccount
1528 Maximum number of jobs in a pending or running state at any
1529 time per account. To clear a previously set value use the
1530 modify command with a new value of -1.
1531
1532
1533 MaxSubmitJobsPerUser
1534 Maximum number of jobs in a pending or running state at any
1535 time per user. To clear a previously set value use the modify
1536 command with a new value of -1.
1537
1538
1539 MaxWallDurationPerJob
1540 Maximum wall clock time each job is able to use. <max wall>
1541 format is <min> or <min>:<sec> or <hr>:<min>:<sec> or
1542 <days>-<hr>:<min>:<sec> or <days>-<hr>. The value is recorded
1543 in minutes with rounding as needed. To clear a previously set
1544 value use the modify command with a new value of -1.
1545
1546 NOTE: This setting shows up in the sacctmgr output as MaxWall.
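The &lt;max wall&gt; formats above can be sketched as a small converter; this is an illustrative reimplementation (rounding seconds up to whole minutes), not Slurm's own parser:

```python
import math

# Illustrative sketch of the <max wall> formats described above:
# <min>, <min>:<sec>, <hr>:<min>:<sec>, <days>-<hr>:<min>:<sec>,
# <days>-<hr>. Returns whole minutes, rounding seconds up.
def wall_to_minutes(spec):
    days = 0
    if "-" in spec:
        d, spec = spec.split("-", 1)
        days = int(d)
        if ":" not in spec:                    # <days>-<hr>
            return days * 24 * 60 + int(spec) * 60
    parts = [int(p) for p in spec.split(":")]
    if len(parts) == 1:                        # <min>
        minutes, seconds = parts[0], 0
    elif len(parts) == 2:                      # <min>:<sec>
        minutes, seconds = parts
    else:                                      # [<days>-]<hr>:<min>:<sec>
        hours, minutes, seconds = parts
        minutes += hours * 60
    return days * 24 * 60 + minutes + math.ceil(seconds / 60)
```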
1547
1548
1549 MinPrioThreshold
1550 Minimum priority required to reserve resources when scheduling.
1551 To clear a previously set value use the modify command with a
1552 new value of -1.
1553
1554
1555 MinTRES
1556 Minimum number of TRES each job running under this QOS must re‐
1557 quest. Otherwise the job will pend until modified. To clear a
1558 previously set value use the modify command with a new value of
1559 -1 for each TRES id.
1560
1561
1562 Name Name of the QOS. Needed for creation.
1563
1564
1565 Preempt
1566 Other QOS' this QOS can preempt. Setting Preempt to '' (two
1567 single quotes with nothing between them) restores its default
1568 setting. You can also use the operators += and -= to add or
1569 remove certain QOSes from a QOS list.
1570
1571
1572 PreemptMode
1573 Mechanism used to preempt jobs of this QOS if the cluster's
1574 PreemptType is configured to preempt/qos. The default preemption
1575 mechanism is specified by the cluster-wide PreemptMode configu‐
1576 ration parameter. Possible values are "Cluster" (meaning use
1577 cluster default), "Cancel", and "Requeue". This option is not
1578 compatible with PreemptMode=OFF or PreemptMode=SUSPEND (i.e.
1579 preempted jobs must be removed from the resources).
1580
1581
1582 Priority
1583 What priority will be added to a job's priority when using this
1584 QOS. To clear a previously set value use the modify command
1585 with a new value of -1.
1586
1587
1588 UsageFactor
1589 A float that is factored into a job's TRES usage (e.g. RawUsage,
1590 TRESMins, TRESRunMins). For example, if the UsageFactor is 2,
1591 every TRESBillingUnit second a job runs counts as 2; if the
1592 UsageFactor is .5, every second counts as only half. A setting
1593 of 0 would add no timed usage from
1594 the job.
1595
1596 The usage factor only applies to the job's QOS and not the par‐
1597 tition QOS.
1598
1599 If the UsageFactorSafe flag is set and AccountingStorageEnforce
1600 includes Safe, jobs will only be able to run if the job can run
1601 to completion with the UsageFactor applied.
1602
1603 If the UsageFactorSafe flag is not set and AccountingStorageEn‐
1604 force includes Safe, a job will be able to be scheduled without
1605 the UsageFactor applied and will be able to run without being
1606 killed due to limits.
1607
1608 If the UsageFactorSafe flag is not set and AccountingStorageEn‐
1609 force does not include Safe, a job will be able to be scheduled
1610 without the UsageFactor applied and could be killed due to lim‐
1611 its.
1612
1613 See AccountingStorageEnforce in slurm.conf man page.
1614
1615 Default is 1. To clear a previously set value use the modify
1616 command with a new value of -1.
1617
1618
1619 LimitFactor
1620 A float that is factored into an association's [Grp|Max]TRES
1621 limits. For example, if the LimitFactor is 2, then an association
1622 with a GrpTRES of 30 CPUs, would be allowed to allocate 60 CPUs
1623 when running under this QOS.
1624
1625 NOTE: This factor is only applied to associations running in
1626 this QOS and is not applied to any limits in the QOS itself.
1627
1628 To clear a previously set value use the modify command with a
1629 new value of -1.
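The arithmetic of the two factors above can be shown in a short sketch (illustrative only, with invented function names): UsageFactor scales what a job is charged, while LimitFactor scales an association's [Grp|Max]TRES limits under this QOS.

```python
# Illustrative sketch, not Slurm source code.
def charged_usage(tres_billing_seconds, usage_factor=1.0):
    """UsageFactor: factor 2 doubles charged usage, 0.5 halves it,
    0 charges nothing."""
    return tres_billing_seconds * usage_factor

def effective_limit(assoc_tres_limit, limit_factor=1.0):
    """LimitFactor: scales an association's TRES limit while its
    jobs run under this QOS."""
    return assoc_tres_limit * limit_factor
```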
1630
1631
1632 SPECIFICATIONS FOR RESERVATIONS
1633 Clusters=<comma separated list of cluster names>
1634 List the reservations of the cluster(s). Default is the cluster
1635 where the command was run.
1636
1637
1638 End=<OPT>
1639 Period ending of reservations. Default is now.
1640
1641 Valid time formats are...
1642
1643 HH:MM[:SS] [AM|PM]
1644 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
1645 MM/DD[/YY]-HH:MM[:SS]
1646 YYYY-MM-DD[THH:MM[:SS]]
1647 now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]
1648
1649
1650 ID=<OPT>
1651 Comma separated list of reservation ids.
1652
1653
1654 Names=<OPT>
1655 Comma separated list of reservation names.
1656
1657
1658 Nodes=<comma separated list of node names>
1659 Node names where reservation ran.
1660
1661
1662 Start=<OPT>
1663 Period start of reservations. Default is 00:00:00 of current
1664 day.
1665
1666 Valid time formats are...
1667
1668 HH:MM[:SS] [AM|PM]
1669 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
1670 MM/DD[/YY]-HH:MM[:SS]
1671 YYYY-MM-DD[THH:MM[:SS]]
1672 now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]
1673
1674
1675 LIST/SHOW RESERVATION FORMAT OPTIONS
1676 Associations
1677 The id's of the associations able to run in the reservation.
1678
1679
1680 Cluster
1681 Name of cluster reservation was on.
1682
1683
1684 End End time of reservation.
1685
1686
1687 Flags Flags on the reservation.
1688
1689
1690 ID Reservation ID.
1691
1692
1693 Name Name of this reservation.
1694
1695
1696 NodeNames
1697 List of nodes in the reservation.
1698
1699
1700 Start Start time of reservation.
1701
1702
1703 TRES List of TRES in the reservation.
1704
1705
1706 UnusedWall
1707 Wall clock time in seconds unused by any job. A job's allocated
1708 usage is its run time multiplied by the ratio of its CPUs to the
1709 total number of CPUs in the reservation. For example, a job us‐
1710 ing all the CPUs in the reservation running for 1 minute would
1711 reduce unused_wall by 1 minute.
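The UnusedWall accounting described above can be written out as a worked example; this is a sketch of the stated formula, not Slurm source code:

```python
# Worked example of the UnusedWall formula: a job's allocated usage
# is its run time multiplied by the ratio of its CPUs to the total
# CPUs in the reservation.
def unused_wall(reservation_seconds, total_cpus, jobs):
    """jobs: list of (job_cpus, run_seconds) inside the reservation."""
    used = sum(cpus / total_cpus * runtime for cpus, runtime in jobs)
    return reservation_seconds - used
```

A job using all 10 CPUs of a reservation for 60 seconds reduces the unused wall time by 60 seconds; a job using 5 of the 10 CPUs for 120 seconds reduces it by the same amount.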
1712
1713
1714
1715 SPECIFICATIONS FOR RESOURCE
1716 Clusters=<name list> Comma separated list of cluster names on which
1717 specified resources are to be available. If no names are designated
1718 then the clusters already allowed to use this resource will be altered.
1719
1720
1721 Count=<OPT>
1722 Number of software resources of a specific name configured on
1723 the system being controlled by a resource manager.
1724
1725
1726 Descriptions=
1727 A brief description of the resource.
1728
1729
1730 Flags=<OPT>
1731 Flags that identify specific attributes of the system resource.
1732 At this time no flags have been defined.
1733
1734
1735 ServerType=<OPT>
1736 The type of a software resource manager providing the licenses.
1737 For example, a FlexNet Publisher (flexlm) license server or
1738 Reprise License Manager (RLM).
1739
1740
1741 Names=<OPT>
1742 Comma separated list of the names of resources configured on
1743 the system being controlled by a resource manager. If a
1744 resource is seen on the slurmctld its name will be name@server
1745 to distinguish it from local resources defined in a slurm.conf.
1746
1747
1748 PercentAllowed=<percent allowed>
1749 Percentage of a specific resource that can be used on specified
1750 cluster.
1751
1752
1753 Server=<OPT>
1754 The name of the server serving up the resource. Default is
1755 'slurmdb' indicating the licenses are being served by the data‐
1756 base.
1757
1758
1759 Type=<OPT>
1760 The type of the resource represented by this record. Currently
1761 the only valid type is License.
1762
1763
1764 WithClusters
1765 Display each cluster's percentage of the resources. If a
1766 resource hasn't been given to a cluster, it will not be
1767 displayed with this flag.
1768
1769
1770 NOTE: Resource is used to define each resource configured on a system
1771 available for usage by Slurm clusters.
1772
1773
1774 LIST/SHOW RESOURCE FORMAT OPTIONS
1775 Cluster
1776 Name of cluster resource is given to.
1777
1778
1779 Count The count of a specific resource configured on the system glob‐
1780 ally.
1781
1782
1783 Allocated
1784 The percent of licenses allocated to a cluster.
1785
1786
1787 Description
1788 Description of the resource.
1789
1790
1791 ServerType
1792 The type of the server controlling the licenses.
1793
1794
1795 Name Name of this resource.
1796
1797
1798 Server Server serving up the resource.
1799
1800
1801 Type Type of resource this record represents.
1802
1803
1804 LIST/SHOW RUNAWAYJOBS FORMAT OPTIONS
1805 Cluster
1806 Name of cluster job ran on.
1807
1808
1809 ID Id of the job.
1810
1811
1812 Name Name of the job.
1813
1814
1815 Partition
1816 Partition job ran on.
1817
1818
1819 State Current State of the job in the database.
1820
1821
1822 TimeStart
1823 Time job started running.
1824
1825
1826 TimeEnd
1827 Current recorded time of the end of the job.
1828
1829
1830 SPECIFICATIONS FOR TRANSACTIONS
1831 Accounts=<comma separated list of account names>
1832 Only print out the transactions affecting specified accounts.
1833
1834
1835 Action=<Specific action the list will display>
1836 Only display transactions of the specified action type.
1837
1838
1839 Actor=<Specific name the list will display>
1840 Only display transactions done by a certain person.
1841
1842
1843 Clusters=<comma separated list of cluster names>
1844 Only print out the transactions affecting specified clusters.
1845
1846
1847 End=<Date and time of last transaction to return>
1848 Return all transactions before this Date and time. Default is
1849 now.
1850
1851
1852 Start=<Date and time of first transaction to return>
1853 Return all transactions after this Date and time. Default is
1854 epoch.
1855
1856 Valid time formats for End and Start are...
1857
1858 HH:MM[:SS] [AM|PM]
1859 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
1860 MM/DD[/YY]-HH:MM[:SS]
1861 YYYY-MM-DD[THH:MM[:SS]]
1862 now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]
1863
1864
1865 Users=<comma separated list of user names>
1866 Only print out the transactions affecting specified users.
1867
1868
1869 WithAssoc
1870 Get information about which associations were affected by the
1871 transactions.
1872
1873
1874
1875 LIST/SHOW TRANSACTIONS FORMAT OPTIONS
1876 Action Displays the type of Action that took place.
1877
1878
1879 Actor Displays the Actor who generated the transaction.
1880
1881
1882 Info Displays details of the transaction.
1883
1884
1885 TimeStamp
1886 Displays when the transaction occurred.
1887
1888
1889 Where Displays details of the constraints for the transaction.
1890
1891 NOTE: If using the WithAssoc option you can also view the information
1892 about the various associations the transaction affected. The Associa‐
1893 tion format fields are described in the LIST/SHOW ASSOCIATION FORMAT
1894 OPTIONS section.
1895
1896
1897
1898 SPECIFICATIONS FOR USERS
1899 Account=<account>
1900 Account name to add this user to.
1901
1902
1903 AdminLevel=<level>
1904 Admin level of user. Valid levels are None, Operator, and Ad‐
1905 min.
1906
1907
1908 Cluster=<cluster>
1909 Specific cluster to add user to the account on. Default is all
1910 in system.
1911
1912
1913 DefaultAccount=<account>
1914 Identify the default bank account name to be used for a job if
1915 none is specified at submission time.
1916
1917
1918 DefaultWCKey=<defaultwckey>
1919 Identify the default Workload Characterization Key.
1920
1921
1922 Name=<name>
1923 Name of user.
1924
1925
1926 NewName=<newname>
1927 Used to rename a user in the accounting database.
1928
1929
1930 Partition=<name>
1931 Partition name.
1932
1933
1934 RawUsage=<value>
1935 This allows an administrator to reset the raw usage accrued to a
1936 user. The only value currently supported is 0 (zero). This is
1937 a settable specification only - it cannot be used as a filter to
1938 list users.
1939
1940
1941 WCKeys=<wckeys>
1942 Workload Characterization Key values.
1943
1944
1945 WithAssoc
1946 Display all associations for this user.
1947
1948
1949 WithCoord
1950 Display all accounts a user is coordinator for.
1951
1952
1953 WithDeleted
1954 Display information with previously deleted data.
1955
1956 NOTE: If using the WithAssoc option you can also query against associa‐
1957 tion specific information to view only certain associations this user
1958 may have. These extra options can be found in the SPECIFICATIONS FOR
1959 ASSOCIATIONS section. You can also use the general specifications list
1960 above in the GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES sec‐
1961 tion.
1962
1963
1964
1965 LIST/SHOW USER FORMAT OPTIONS
1966 AdminLevel
1967 Admin level of user.
1968
1969
1970 Coordinators
1971 List of users that are a coordinator of the account. (Only
1972 filled in when using the WithCoordinator option.)
1973
1974
1975 DefaultAccount
1976 The user's default account.
1977
1978
1979 DefaultWCKey
1980 The user's default wckey.
1981
1982
1983 User The name of a user.
1984
1985 NOTE: If using the WithAssoc option you can also view the information
1986 about the various associations the user may have on all the clusters in
1987 the system. The association information can be filtered. Note that
1988 all the users in the database will always be shown, as the filter
1989 only takes effect on the association data. The Association format
1990 fields are described in the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.
1991
1992
1993
1994 LIST/SHOW WCKEY FORMAT OPTIONS
1995 Cluster
1996 Specific cluster for the WCKey.
1997
1998
1999 ID The ID of the WCKey.
2000
2001
2002 User The name of a user for the WCKey.
2003
2004
2005 WCKey Workload Characterization Key.
2006
2007
2008 LIST/SHOW TRES FORMAT OPTIONS
2009 Name The name of the trackable resource. This option is required for
2010 TRES types BB (Burst buffer), GRES, and License. Types CPU, En‐
2011 ergy, Memory, and Node do not have Names. For example, if GRES
2012 is the type, then the name is that of the GRES itself,
2013 e.g. gpu.
2014
2015
2016 ID The identification number of the trackable resource as it ap‐
2017 pears in the database.
2018
2019
2020 Type The type of the trackable resource. Current types are BB (Burst
2021 buffer), CPU, Energy, GRES, License, Memory, and Node.
2022
2023
2025 Trackable RESources (TRES) are used in many QOS or Association limits.
2026 When setting these limits, they are given as a comma separated list.
2027 Each TRES has its own limit, i.e. GrpTRESMins=cpu=10,mem=20 sets two
2028 different limits: one of 10 CPU minutes and one of 20 MB memory
2029 minutes. This is the case for each limit that deals with TRES. To
2030 remove a limit, use -1, i.e. GrpTRESMins=cpu=-1 removes only the cpu TRES limit.
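The comma separated TRES syntax above can be sketched as a small parser; this is an illustrative helper, not part of sacctmgr:

```python
# Illustrative helper for the comma separated TRES limit syntax,
# e.g. "cpu=10,mem=20"; a value of -1 clears that TRES's limit.
def parse_tres_limits(spec):
    limits = {}
    for item in spec.split(","):
        name, value = item.split("=", 1)
        limits[name] = int(value)
    return limits
```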
2031
2032 NOTE: When dealing with Memory as a TRES all limits are in MB.
2033
2034 NOTE: The Billing TRES is calculated from a partition's TRESBilling‐
2035 Weights. It is temporarily calculated during scheduling for each parti‐
2036 tion to enforce billing TRES limits. The final Billing TRES is calcu‐
2037 lated after the job has been allocated resources. The final number can
2038 be seen in scontrol show jobs and sacct output.
2039
2040
2041 GLOBAL FORMAT OPTION
2042 When using the format option for listing various fields you can put a
2043 %NUMBER afterwards to specify how many characters should be printed.
2044
2045 e.g. format=name%30 will print 30 characters of field name right justi‐
2046 fied. A -30 will print 30 characters left justified.
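The %NUMBER width behavior can be mimicked in a few lines (an illustrative sketch of the described behavior, not sacctmgr's own formatter): positive widths right-justify, negative widths left-justify, and values longer than the width are truncated.

```python
# Illustrative sketch of the %NUMBER field-width rule described above.
def format_field(value, width):
    text = str(value)[:abs(width)]          # truncate to the width
    return text.rjust(width) if width > 0 else text.ljust(-width)
```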
2047
2048
2049 FLAT FILE DUMP AND LOAD
2050 sacctmgr has the capability to load and dump Slurm association data to
2051 and from a file. This method can easily add a new cluster or copy an
2052 existing cluster's associations into a new cluster with similar ac‐
2053 counts. Each file contains Slurm association data for a single cluster.
2054 Comments can be put into the file with the # character. Each line of
2055 information must begin with one of the four titles: Cluster, Parent,
2056 Account or User. Following the title is a space, dash, space, entity
2057 value, then specifications. Specifications are colon separated. If any
2057 value, then specifications. Specifications are colon separated. If any
2058 variable, such as an Organization name, has a space in it, surround the
2059 name with single or double quotes.
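A hypothetical fragment (all cluster, account, and user names invented for illustration) showing the title - value:specification layout and the quoting of values that contain spaces:

```
# Hypothetical example file; names are invented for illustration.
Cluster - tux:MaxTRESPerJob=node=15
Parent - root
Account - physics:Description='physics team':Organization='Physics Dept':FairShare=100
Parent - physics
Account - hep:Description='high energy':Organization='Physics Dept':FairShare=50
User - alice:DefaultAccount=hep:FairShare=10
```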

       To create a file of associations you can run
       sacctmgr dump tux file=tux.cfg

       To load a previously created file you can run
       sacctmgr load file=tux.cfg

       sacctmgr dump/load must be run as a Slurm administrator or root. If
       using sacctmgr load on a database without any associations, it must
       be run as root (because there aren't any users in the database yet).

       Other options for load are:

       clean    - delete what was already there and start from scratch with
                  this information.
       Cluster= - specify a different name for the cluster than that which
                  is in the file.
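As a sketch combining both options (the file and cluster names are illustrative), the following would wipe any associations already present and load the file's associations under the name tux2 instead of the name recorded in the file:

```shell
$ sacctmgr load file=tux.cfg clean cluster=tux2
```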

       Since the associations in the system follow a hierarchy, so does the
       file. Anything that is a parent needs to be defined before any
       children. The only exception is the understood 'root' account. This
       is always a default for any cluster and does not need to be defined.

       To edit/create a file, start with a cluster line for the new cluster:

       Cluster - cluster_name:MaxTRESPerJob=node=15

       Anything included on this line will be the default for all
       associations on this cluster. The options for the cluster are:

       GrpTRESMins=
              The total number of TRES minutes that can possibly be used by
              past, present and future jobs running from this association
              and its children.

       GrpTRESRunMins=
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limit of running jobs
              and consumes it; if the limit is reached, no new jobs are
              started until other jobs finish and free up time.

       GrpTRES=
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpJobs=
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.

       GrpJobsAccrue=
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.

       GrpNodes=
              Maximum number of nodes running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpSubmitJobs=
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.

       GrpWall=
              Maximum wall clock time running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs the children of this association can
              run.

       MaxTRESPerJob=
              Maximum number of trackable resources per job the children of
              this association can run.

       MaxWallDurationPerJob=
              Maximum time (not related to job size) that jobs of this
              account's children can run.

       QOS=
              Comma separated list of Quality of Service names (defined in
              sacctmgr).

       After the entry for the root account you will have entries for the
       other accounts on the system. The entries will look similar to this
       example:

       Parent - root
       Account - cs:MaxTRESPerJob=node=5:MaxJobs=4:FairShare=399:MaxWallDurationPerJob=40:Description='Computer Science':Organization='LC'
       Parent - cs
       Account - test:MaxTRESPerJob=node=1:MaxJobs=1:FairShare=1:MaxWallDurationPerJob=1:Description='Test Account':Organization='Test'

       Any of the options after a ':' can be left out, and they can be in
       any order. If you want to add any sub accounts, just list the Parent
       THAT HAS ALREADY BEEN CREATED before the account you are adding.

       Account options are:

       Description=
              A brief description of the account.

       GrpTRESMins=
              Maximum number of TRES minutes running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       GrpTRESRunMins=
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limit of running jobs
              and consumes it; if the limit is reached, no new jobs are
              started until other jobs finish and free up time.

       GrpTRES=
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpJobs=
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.

       GrpJobsAccrue=
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.

       GrpNodes=
              Maximum number of nodes running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpSubmitJobs=
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.

       GrpWall=
              Maximum wall clock time running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs the children of this association can
              run.

       MaxNodesPerJob=
              Maximum number of nodes per job the children of this
              association can run.

       MaxWallDurationPerJob=
              Maximum time (not related to job size) that jobs of this
              account's children can run.

       Organization=
              Name of the organization that owns this account.

       QOS(=,+=,-=)
              Comma separated list of Quality of Service names (defined in
              sacctmgr).

       To add users to an account, add a line after the Parent line,
       similar to this:

       Parent - test
       User - adam:MaxTRESPerJob=node=2:MaxJobs=3:FairShare=1:MaxWallDurationPerJob=1:AdminLevel=Operator:Coordinator='test'


       User options are:

       AdminLevel=
              Type of admin this user is (Administrator, Operator).
              Must be defined on the first occurrence of the user.

       Coordinator=
              Comma separated list of accounts this user is coordinator
              over.
              Must be defined on the first occurrence of the user.

       DefaultAccount=
              System wide default account name.
              Must be defined on the first occurrence of the user.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs this user can run.

       MaxTRESPerJob=
              Maximum number of trackable resources per job this user can
              run.

       MaxWallDurationPerJob=
              Maximum time (not related to job size) this user can run.

       QOS(=,+=,-=)
              Comma separated list of Quality of Service names (defined in
              sacctmgr).


ARCHIVE FUNCTIONALITY
       sacctmgr has the capability to archive data to a flat file and/or
       load that data if needed later. The archiving is usually done by
       slurmdbd, and it is highly recommended you only do it through
       sacctmgr if you completely understand what you are doing. For
       slurmdbd options see "man slurmdbd" for more information. Loading
       data into the database can be done from these files to either view
       old data or regenerate rolled up data.


       archive dump
              Dump accounting data to a file. Data will not be archived
              unless the corresponding purge option is included in this
              command or in slurmdbd.conf. This operation cannot be rolled
              back once executed. If one of the following options is not
              specified when sacctmgr is called, the value configured in
              slurmdbd.conf is used.

       Directory=
              Directory to store the archive data.

       Events Archive Events. If not specified and PurgeEventAfter is set,
              all event data removed will be lost permanently.

       Jobs   Archive Jobs. If not specified and PurgeJobAfter is set, all
              job data removed will be lost permanently.

       PurgeEventAfter=
              Purge cluster event records older than the time stated, in
              months. If you want to purge on a shorter time period, you
              can append hours or days to the numeric value to get those
              more frequent purges. (e.g. a value of '12hours' would purge
              everything older than 12 hours.)

       PurgeJobAfter=
              Purge job records older than the time stated, in months. If
              you want to purge on a shorter time period, you can append
              hours or days to the numeric value to get those more frequent
              purges. (e.g. a value of '12hours' would purge everything
              older than 12 hours.)

       PurgeStepAfter=
              Purge step records older than the time stated, in months. If
              you want to purge on a shorter time period, you can append
              hours or days to the numeric value to get those more frequent
              purges. (e.g. a value of '12hours' would purge everything
              older than 12 hours.)

       PurgeSuspendAfter=
              Purge job suspend records older than the time stated, in
              months. If you want to purge on a shorter time period, you
              can append hours or days to the numeric value to get those
              more frequent purges. (e.g. a value of '12hours' would purge
              everything older than 12 hours.)

       Script=
              Run this script instead of the generic form of archiving to
              flat files.

       Steps  Archive Steps. If not specified and PurgeStepAfter is set,
              all step data removed will be lost permanently.

       Suspend
              Archive Suspend Data. If not specified and PurgeSuspendAfter
              is set, all suspend data removed will be lost permanently.
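As an illustrative sketch (the directory path and retention periods below are examples, not defaults), a one-off dump that archives job and step records and purges those older than 12 months might look like:

```shell
$ sacctmgr archive dump directory=/var/slurm/archive \
      purgejobafter=12months purgestepafter=12months jobs steps
```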

       archive load
              Load previously archived data into the database. The archive
              file will not be loaded if the records already exist in the
              database; therefore, trying to load an archive file more than
              once will result in an error. When this data is again
              archived and purged from the database, if the old archive
              file is still in the directory ArchiveDir, a new archive file
              will be created (see ArchiveDir in the slurmdbd.conf man
              page), so the old file will not be overwritten and these
              files will have duplicate records.

       File=  File to load into the database. The specified file must exist
              on the slurmdbd host, which is not necessarily the machine
              running the command.

       Insert=
              SQL to insert directly into the database. This should be used
              very cautiously since it writes your SQL directly into the
              database.
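To restore a previously dumped archive (the file name below is illustrative), make sure the file exists on the slurmdbd host and load it:

```shell
$ sacctmgr archive load file=/var/slurm/archive/tux_job_archive
```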

PERFORMANCE
       Executing sacctmgr sends a remote procedure call to slurmdbd. If
       enough calls from sacctmgr or other Slurm client commands that send
       remote procedure calls to the slurmdbd daemon come in at once, it
       can result in a degradation of performance of the slurmdbd daemon,
       possibly resulting in a denial of service.

       Do not run sacctmgr or other Slurm client commands that send remote
       procedure calls to slurmdbd from loops in shell scripts or other
       programs. Ensure that programs limit calls to sacctmgr to the
       minimum necessary for the information you are trying to gather.


ENVIRONMENT VARIABLES
       Some sacctmgr options may be set via environment variables. These
       environment variables, along with their corresponding options, are
       listed below. (Note: Command line options will always override these
       settings.)

       SLURM_CONF          The location of the Slurm configuration file.


EXAMPLES
       NOTE: There is an order to set up accounting associations. You must
       define clusters before you add accounts, and you must add accounts
       before you can add users.

       $ sacctmgr create cluster tux
       $ sacctmgr create account name=science fairshare=50
       $ sacctmgr create account name=chemistry parent=science fairshare=30
       $ sacctmgr create account name=physics parent=science fairshare=20
       $ sacctmgr create user name=adam cluster=tux account=physics fairshare=10
       $ sacctmgr delete user name=adam cluster=tux account=physics
       $ sacctmgr delete account name=physics cluster=tux
       $ sacctmgr modify user where name=adam cluster=tux account=physics set maxjobs=2 maxwall=30:00
       $ sacctmgr add user brian account=chemistry
       $ sacctmgr list associations cluster=tux format=Account,Cluster,User,Fairshare tree withd
       $ sacctmgr list transactions Action="Add Users" Start=11/03-10:30:00 format=Where,Time
       $ sacctmgr dump cluster=tux file=tux_data_file
       $ sacctmgr load tux_data_file

       A user's account cannot be changed directly. A new association needs
       to be created for the user with the new account. Then the
       association with the old account can be deleted.
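Reusing the names from the examples above, moving user adam from the physics account to the chemistry account would therefore be done in two steps:

```shell
$ sacctmgr add user name=adam cluster=tux account=chemistry
$ sacctmgr delete user name=adam cluster=tux account=physics
```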

       When modifying an object, placement of the keyword 'set' and the
       optional 'where' is critical; below are examples that produce
       correct results. As a rule of thumb, anything you put in front of
       'set' will be used as a qualifier. If you want to put a qualifier
       after the keyword 'set', you should use the keyword 'where'. The
       following is wrong:

       $ sacctmgr modify user name=adam set fairshare=10 cluster=tux

       This will produce an error, as the above line reads "modify user
       adam, set fairshare=10 and cluster=tux". Either of the following is
       correct:

       $ sacctmgr modify user name=adam cluster=tux set fairshare=10
       $ sacctmgr modify user name=adam set fairshare=10 where cluster=tux

       When changing the QOS for something, only use the '=' operator when
       you want to explicitly set the QOS. In most cases you will want to
       use the '+=' or '-=' operator to either add to or remove from the
       QOS already in place.

       If a user already has the QOS normal,standby from a parent, or it
       was explicitly set, you should use qos+=expedite to add expedite to
       the list.

       If you are looking to add the QOS expedite to only a certain account
       and/or cluster, you can do that by specifying them on the sacctmgr
       line.

       $ sacctmgr modify user name=adam set qos+=expedite

       or

       $ sacctmgr modify user name=adam acct=this cluster=tux set qos+=expedite

       Let's give an example of how to add a QOS to accounts. List all
       available QOSs in the cluster:

       $ sacctmgr show qos format=name
             Name
       ---------
          normal
        expedite

       List all the associations in the cluster:

       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account    QOS
       -------- ---------- ------
          zebra       root  normal
          zebra       root  normal
          zebra          g  normal
          zebra         g1  normal

       Add the QOS expedite to account g1 and display the result. Using the
       operator +=, the QOS is added together with the existing QOS of this
       account.

       $ sacctmgr modify account name=g1 set qos+=expedite
       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account    QOS
       -------- ---------- ------
          zebra       root  normal
          zebra       root  normal
          zebra          g  normal
          zebra         g1  expedite,normal

       Now set expedite as the only QOS for account g and display the
       result. Using the operator =, expedite becomes the only QOS usable
       by account g:

       $ sacctmgr modify account name=g set qos=expedite
       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account    QOS
       -------- ---------- ------
          zebra       root  normal
          zebra       root  normal
          zebra          g  expedite
          zebra         g1  expedite,normal

       If a new account is added under account g, it will inherit the QOS
       expedite and will not have access to QOS normal.

       $ sacctmgr add account banana parent=g
       $ sacctmgr show assoc format=cluster,account,qos
        Cluster    Account    QOS
       -------- ---------- ------
          zebra       root  normal
          zebra       root  normal
          zebra          g  expedite
          zebra     banana  expedite
          zebra         g1  expedite,normal

       An example of listing trackable resources:

       $ sacctmgr show tres
             Type              Name       ID
       ---------- ----------------- --------
              cpu                          1
              mem                          2
           energy                          3
             node                          4
          billing                          5
             gres         gpu:tesla     1001
          license               vcs     1002
               bb              cray     1003


COPYING
       Copyright (C) 2008-2010 Lawrence Livermore National Security.
       Produced at Lawrence Livermore National Laboratory (cf. DISCLAIMER).
       Copyright (C) 2010-2021 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by
       the Free Software Foundation; either version 2 of the License, or
       (at your option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
       or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
       License for more details.


SEE ALSO
       slurm.conf(5), slurmdbd(8)



October 2021                   Slurm Commands                    sacctmgr(1)