sacctmgr(1)                     Slurm Commands                    sacctmgr(1)


NAME
       sacctmgr - Used to view and modify Slurm account information.


SYNOPSIS
       sacctmgr [OPTIONS...] [COMMAND...]

DESCRIPTION
       sacctmgr is used to view or modify Slurm account information.  The
       account information is maintained within a database with the
       interface being provided by slurmdbd (Slurm Database daemon).  This
       database can serve as a central storehouse of user and computer
       information for multiple computers at a single site.  Slurm account
       information is recorded based upon four parameters that form what is
       referred to as an association.  These parameters are user, cluster,
       partition, and account.  user is the login name.  cluster is the
       name of a Slurm managed cluster as specified by the ClusterName
       parameter in the slurm.conf configuration file.  partition is the
       name of a Slurm partition on that cluster.  account is the bank
       account for a job.  The intended mode of operation is to initiate
       the sacctmgr command, add, delete, modify, and/or list association
       records, then commit the changes and exit.
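
       For example, a short interactive session following this mode of
       operation might look like the sketch below.  The account and user
       names are invented for illustration, and the exact prompts and
       commit confirmations depend on your site's configuration:

```
$ sacctmgr
sacctmgr: add account science Description="science accounts"
sacctmgr: add user brian Account=science
sacctmgr: list associations
sacctmgr: exit
```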

       Note: The contents of Slurm's database are maintained in lower case.
       This may result in some sacctmgr output differing from that of other
       Slurm commands.

OPTIONS
       -h, --help
              Print a help message describing the usage of sacctmgr.  This
              is equivalent to the help command.

       -i, --immediate
              Commit changes immediately without asking for confirmation.

       -n, --noheader
              No header will be added to the beginning of the output.

       -p, --parsable
              Output will be '|' delimited with a '|' at the end.

       -P, --parsable2
              Output will be '|' delimited without a '|' at the end.

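       To make the difference concrete, here is a sketch of
       post-processing parsable output.  The sample rows are invented, not
       captured from a live cluster; with -p each line would additionally
       end in a trailing '|':

```shell
# Invented sample in the style of `sacctmgr -P list account` output.
printf 'Account|Descr|Org\nphysics|physics dept|science\n' |
    awk -F'|' 'NR > 1 { print $1 }'   # -> physics
```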
       -Q, --quiet
              Print no messages other than error messages.  This is
              equivalent to the quiet command.

       -r, --readonly
              Makes it so the running sacctmgr cannot modify accounting
              information.  The readonly option is for use within
              interactive mode.

       -s, --associations
              Use with show or list to display associations with the
              entity.  This is equivalent to the associations command.

       -v, --verbose
              Enable detailed logging.  This is equivalent to the verbose
              command.

       -V, --version
              Display version number.  This is equivalent to the version
              command.

COMMANDS
       add <ENTITY> <SPECS>
              Add an entity.  Identical to the create command.

       associations
              Use with show or list to display associations with the
              entity.

       clear stats
              Clear the server statistics.

       create <ENTITY> <SPECS>
              Add an entity.  Identical to the add command.

       delete <ENTITY> where <SPECS>
              Delete the specified entities.

       dump <ENTITY> [File=FILENAME]
              Dump cluster data to the specified file.  If the filename is
              not specified it uses clustername.cfg by default.

       help   Display a description of sacctmgr options and commands.

       list <ENTITY> [<SPECS>]
              Display information about the specified entity.  By default,
              all entries are displayed; you can narrow results by
              specifying SPECS in your query.  Identical to the show
              command.

       load <FILENAME>
              Load cluster data from the specified file.  This is a
              configuration file generated by running the sacctmgr dump
              command.  This command does not load archive data; see the
              sacctmgr archive load option instead.

       modify <ENTITY> where <SPECS> set <SPECS>
              Modify an entity.

       problem
              Use with show or list to display entity problems.

       reconfigure
              Reconfigures the SlurmDBD if running with one.

       show <ENTITY> [<SPECS>]
              Display information about the specified entity.  By default,
              all entries are displayed; you can narrow results by
              specifying SPECS in your query.  Identical to the list
              command.

       shutdown
              Shutdown the server.

       version
              Display the version number of sacctmgr.

       NOTE: All commands listed below can be used in the interactive mode,
       but NOT on the initial command line.

       exit   Terminate sacctmgr interactive mode.  Identical to the quit
              command.

       quiet  Print no messages other than error messages.

       quit   Terminate the execution of sacctmgr interactive mode.
              Identical to the exit command.

       verbose
              Enable detailed logging.  This includes time-stamps on data
              structures, record counts, etc.  This is an independent
              command with no options meant for use in interactive mode.

       !!     Repeat the last command.

ENTITIES
       account
              A bank account, typically specified at job submit time using
              the --account= option.  These may be arranged in a
              hierarchical fashion, for example accounts chemistry and
              physics may be children of the account science.  The
              hierarchy may have an arbitrary depth.

       association
              The entity used to group information consisting of four
              parameters: account, cluster, partition (optional), and
              user.  Used only with the list or show command.  Add, modify,
              and delete should be done to a user, account or cluster
              entity.  This will in turn update the underlying
              associations.

       cluster
              The ClusterName parameter in the slurm.conf configuration
              file, used to differentiate accounts on different machines.

       configuration
              Used only with the list or show command to report the current
              system configuration.

       coordinator
              A special privileged user, usually an account manager or
              similar, that can add users or sub-accounts to the account
              they are coordinator over.  This should be a trusted person
              since they can change limits on account and user associations
              inside their realm.

       event  Events like downed or draining nodes on clusters.

       federation
              A group of clusters that work together to schedule jobs.

       job    Used to modify specific fields of a job: Derived Exit Code
              and the Comment String.

       qos    Quality of Service.

       Resource
              Software resources for the system.  These are software
              licenses shared among clusters.

       RunawayJobs
              Used only with the list or show command to report current
              jobs that have been orphaned on the local cluster and are now
              runaway.  If there are jobs in this state it will also give
              you an option to "fix" them.

       stats  Used with the list or show command to view server statistics.
              Accepts an optional argument of ave_time or total_time to
              sort on those fields.  By default, sorts on the increasing
              RPC count field.

       transaction
              List of transactions that have occurred during a given time
              period.

       user   The login name.  Only lowercase usernames are supported.

       wckeys Workload Characterization Key.  An arbitrary string for
              grouping orthogonal accounts.

GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES
       NOTE: The group limits (GrpJobs, GrpTRES, etc.) are tested when a
       job is being considered for being allocated resources.  If starting
       a job would cause any of its group limits to be exceeded, that job
       will not be considered for scheduling even if that job might preempt
       other jobs which would release sufficient group resources for the
       pending job to be initiated.

       DefaultQOS=<default qos>
              The default QOS this association and its children should
              have.  This is overridden if set directly on a user.  To
              clear a previously set value use the modify command with a
              new value of -1.

       Fairshare=<fairshare number | parent>
              Number used in conjunction with other accounts to determine
              job priority.  Can also be the string parent; when used on a
              user this means that the parent association is used for
              fairshare.  If Fairshare=parent is set on an account, that
              account's children will be effectively reparented for
              fairshare calculations to the first parent of their parent
              that is not Fairshare=parent.  Limits remain the same; only
              its fairshare value is affected.  To clear a previously set
              value use the modify command with a new value of -1.
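
       The reparenting rule above can be sketched as a walk up the account
       tree.  The hierarchy and fairshare values below are invented for
       illustration, not data read from a real Slurm database:

```shell
# Invented hierarchy: chemistry -> science -> root.
parent_of()    { case $1 in chemistry) echo science ;; science) echo root ;; esac; }
fairshare_of() { case $1 in chemistry|science) echo parent ;; root) echo 100 ;; esac; }

# Climb until an association without Fairshare=parent is found.
effective_fairshare_source() {
    acct=$1
    while [ "$(fairshare_of "$acct")" = "parent" ]; do
        acct=$(parent_of "$acct")
    done
    echo "$acct"
}

effective_fairshare_source chemistry   # -> root
```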

       GraceTime=<preemption grace time in seconds>
              Specifies, in units of seconds, the preemption grace time to
              be extended to a job which has been selected for preemption.
              The default value is zero, meaning no preemption grace time
              is allowed on this QOS.

              NOTE: This value is only meaningful for QOS
              PreemptMode=CANCEL.

       GrpTRESMins=<TRES=max TRES minutes,...>
              The total number of TRES minutes that can possibly be used
              by past, present and future jobs running from this
              association and its children.  To clear a previously set
              value use the modify command with a new value of -1 for each
              TRES id.

              NOTE: This limit is not enforced if set on the root
              association of a cluster.  So even though it may appear in
              sacctmgr output, it will not be enforced.

              ALSO NOTE: This limit only applies when using the Priority
              Multifactor plugin.  The time is decayed using the value of
              PriorityDecayHalfLife or PriorityUsageResetPeriod as set in
              the slurm.conf.  When this limit is reached all associated
              running jobs will be killed and all future jobs submitted
              with associations in the group will be delayed until they
              are able to run inside the limit.
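
       As a rough illustration of that decay, accrued TRES minutes are
       halved every PriorityDecayHalfLife.  The numbers below are
       invented, and the priority plugin actually applies the decay
       incrementally rather than in one step:

```shell
# 1000 accrued TRES-minutes, 7-day half-life, 14 days elapsed:
# usage * 0.5^(elapsed / halflife)
awk 'BEGIN { printf "%.0f\n", 1000 * 0.5 ^ (14 / 7) }'   # -> 250
```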

       GrpTRESRunMins=<TRES=max TRES run minutes,...>
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limits of running
              jobs and consumes them; if the limit is reached, no new jobs
              are started until other jobs finish and free up time.
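
       The consumed amount can be pictured as the sum, over running jobs,
       of allocated TRES times the remaining time limit.  A sketch with
       invented job data (two jobs, CPU TRES only):

```shell
# Each input line is one running job: <cpus> <remaining minutes>.
printf '4 60\n2 30\n' |
    awk '{ used += $1 * $2 } END { print used }'   # prints 300
```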

       GrpTRES=<TRES=max TRES,...>
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.  To clear a previously set
              value use the modify command with a new value of -1 for each
              TRES id.

              NOTE: This limit only applies fully when using the Select
              Consumable Resource plugin.

       GrpJobs=<max jobs>
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.  To clear a previously set value use the modify
              command with a new value of -1.

       GrpJobsAccrue=<max jobs>
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.  To clear a previously set
              value use the modify command with a new value of -1.

       GrpSubmitJobs=<max jobs>
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.  To
              clear a previously set value use the modify command with a
              new value of -1.

       GrpWall=<max wall>
              Maximum wall clock time running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.  To
              clear a previously set value use the modify command with a
              new value of -1.

              NOTE: This limit is not enforced if set on the root
              association of a cluster.  So even though it may appear in
              sacctmgr output, it will not be enforced.

              ALSO NOTE: This limit only applies when using the Priority
              Multifactor plugin.  The time is decayed using the value of
              PriorityDecayHalfLife or PriorityUsageResetPeriod as set in
              the slurm.conf.  When this limit is reached all associated
              running jobs will be killed and all future jobs submitted
              with associations in the group will be delayed until they
              are able to run inside the limit.

       MaxTRESMins=<max TRES minutes>
              Maximum number of TRES minutes each job is able to use in
              this association.  This is overridden if set directly on a
              user.  Default is the cluster's limit.  To clear a
              previously set value use the modify command with a new value
              of -1 for each TRES id.

       MaxTRES=<max TRES>
              Maximum number of TRES each job is able to use in this
              association.  This is overridden if set directly on a user.
              Default is the cluster's limit.  To clear a previously set
              value use the modify command with a new value of -1 for each
              TRES id.

              NOTE: This limit only applies fully when using the Select
              Consumable Resource plugin.

       MaxJobs=<max jobs>
              Maximum number of jobs each user is allowed to run at one
              time in this association.  This is overridden if set
              directly on a user.  Default is the cluster's limit.  To
              clear a previously set value use the modify command with a
              new value of -1.

       MaxJobsAccrue=<max jobs>
              Maximum number of pending jobs able to accrue age priority
              at any given time for the given association.  This is
              overridden if set directly on a user.  Default is the
              cluster's limit.  To clear a previously set value use the
              modify command with a new value of -1.

       MaxSubmitJobs=<max jobs>
              Maximum number of jobs which this association can have in a
              pending or running state at any time.  Default is the
              cluster's limit.  To clear a previously set value use the
              modify command with a new value of -1.

       MaxWall=<max wall>
              Maximum wall clock time each job is able to use in this
              association.  This is overridden if set directly on a user.
              Default is the cluster's limit.  <max wall> format is <min>
              or <min>:<sec> or <hr>:<min>:<sec> or <days>-<hr>:<min>:<sec>
              or <days>-<hr>.  The value is recorded in minutes with
              rounding as needed.  To clear a previously set value use the
              modify command with a new value of -1.

              NOTE: Changing this value will have no effect on any running
              or pending job.
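
       To make the accepted formats concrete, the illustrative helper
       below (not part of sacctmgr) converts each listed form to whole
       minutes, rounding seconds up:

```shell
# Illustrative conversion of <max wall> forms to whole minutes.
to_minutes() (
    spec=$1 days=0 h=0 m=0 s=0
    case $spec in
        *-*) days=${spec%%-*}; spec=${spec#*-} ;;
    esac
    IFS=:; set -- $spec              # split on ':' in this subshell only
    case $# in
        1) if [ "$days" -gt 0 ]; then h=$1; else m=$1; fi ;;  # <days>-<hr> or <min>
        2) m=$1 s=$2 ;;                                       # <min>:<sec>
        3) h=$1 m=$2 s=$3 ;;                                  # [<days>-]<hr>:<min>:<sec>
    esac
    echo $(( days * 1440 + h * 60 + m + (s + 59) / 60 ))
)

to_minutes 90           # -> 90
to_minutes 2-12:30:00   # -> 3630
to_minutes 1-12         # -> 2160
```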

       QosLevel<operator><comma separated list of qos names>
              Specify the default Quality of Service values that jobs are
              able to run at for this association.  To get a list of valid
              QOS's use 'sacctmgr list qos'.  This value will override its
              parent's value and push down to its children as the new
              default.  Setting a QosLevel to '' (two single quotes with
              nothing between them) restores its default setting.  You can
              also use the operators += and -= to add or remove certain
              QOS's from a QOS list.

              Valid <operator> values include:

              =  Set QosLevel to the specified value.  Note: the QOS that
                 can be used at a given account in the hierarchy are
                 inherited by the children of that account.  By assigning
                 QOS with the = sign only the assigned QOS can be used by
                 the account and its children.

              += Add the specified <qos> value to the current QosLevel.
                 The account will have access to this QOS and the others
                 previously assigned to it.

              -= Remove the specified <qos> value from the current
                 QosLevel.

       See the EXAMPLES section below.
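
       The effect of the operators on an association's QOS list can be
       sketched as simple list edits; the QOS names here are invented for
       illustration:

```shell
qos="normal,standby"                         # inherited QosLevel
qos="$qos,high"                              # QosLevel+=high
qos=$(printf '%s\n' "$qos" | tr ',' '\n' |
      grep -v '^normal$' | paste -s -d, -)   # QosLevel-=normal
echo "$qos"                                  # -> standby,high
```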

SPECIFICATIONS FOR ACCOUNTS
       Cluster=<cluster>
              Specific cluster to add the account to.  Default is all
              clusters in the system.

       Description=<description>
              An arbitrary string describing an account.

       Name=<name>
              The name of a bank account.  Note the name must be unique
              and cannot represent different bank accounts at different
              points in the account hierarchy.

       Organization=<org>
              Organization to which the account belongs.

       Parent=<parent>
              Parent account of this account.  Default is the root
              account, a top level account.

       RawUsage=<value>
              This allows an administrator to reset the raw usage accrued
              to an account.  The only value currently supported is 0
              (zero).  This is a settable specification only - it cannot
              be used as a filter to list accounts.

       WithAssoc
              Display all associations for this account.

       WithCoord
              Display all coordinators for this account.

       WithDeleted
              Display information with previously deleted data.

       NOTE: If using the WithAssoc option you can also query against
       association specific information to view only certain associations
       this account may have.  These extra options can be found in the
       SPECIFICATIONS FOR ASSOCIATIONS section.  You can also use the
       general specifications list above in the GENERAL SPECIFICATIONS FOR
       ASSOCIATION BASED ENTITIES section.

LIST/SHOW ACCOUNT FORMAT OPTIONS
       Account
              The name of a bank account.

       Description
              An arbitrary string describing an account.

       Organization
              Organization to which the account belongs.

       Coordinators
              List of users that are a coordinator of the account.  (Only
              filled in when using the WithCoordinator option.)

       NOTE: If using the WithAssoc option you can also view the
       information about the various associations the account may have on
       all the clusters in the system.  The association information can be
       filtered.  Note that all the accounts in the database will always
       be shown, as the filter only takes effect over the association
       data.  The Association format fields are described in the LIST/SHOW
       ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR ASSOCIATIONS
       Clusters=<comma separated list of cluster names>
              List the associations of the cluster(s).

       Accounts=<comma separated list of account names>
              List the associations of the account(s).

       Users=<comma separated list of user names>
              List the associations of the user(s).

       Partition=<comma separated list of partition names>
              List the associations of the partition(s).

       NOTE: You can also use the general specifications list above in the
       GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

       Other options unique for listing associations:

       OnlyDefaults
              Display only associations that are default associations.

       Tree   Display account names in a hierarchical fashion.

       WithDeleted
              Display information with previously deleted data.

       WithSubAccounts
              Display information with sub-accounts.  Only really valuable
              when used with the account= option.  This will display all
              the sub-account associations along with the accounts listed
              in the option.

       WOLimits
              Display information without limit information.  This is for
              a smaller default format of "Cluster,Account,User,Partition".

       WOPInfo
              Display information without parent information (i.e., parent
              id and parent account name).  This option also implicitly
              sets the WOPLimits option.

       WOPLimits
              Display information without hierarchical parent limits
              (i.e., will only display limits where they are set instead
              of propagating them from the parent).

LIST/SHOW ASSOCIATION FORMAT OPTIONS
       Account
              The name of a bank account in the association.

       Cluster
              The name of a cluster in the association.

       DefaultQOS
              The QOS the association will use by default if it has access
              to it in the QOS list mentioned below.

       Fairshare
              Number used in conjunction with other accounts to determine
              job priority.  Can also be the string parent; when used on a
              user this means that the parent association is used for
              fairshare.  If Fairshare=parent is set on an account, that
              account's children will be effectively reparented for
              fairshare calculations to the first parent of their parent
              that is not Fairshare=parent.  Limits remain the same; only
              its fairshare value is affected.

       GrpTRESMins
              The total number of TRES minutes that can possibly be used
              by past, present and future jobs running from this
              association and its children.

       GrpTRESRunMins
              Used to limit the combined total number of TRES minutes used
              by all jobs running with this association and its children.
              This takes into consideration the time limits of running
              jobs and consumes them; if the limit is reached, no new jobs
              are started until other jobs finish and free up time.

       GrpTRES
              Maximum number of TRES running jobs are able to be allocated
              in aggregate for this association and all associations which
              are children of this association.

       GrpJobs
              Maximum number of running jobs in aggregate for this
              association and all associations which are children of this
              association.

       GrpJobsAccrue
              Maximum number of pending jobs in aggregate able to accrue
              age priority for this association and all associations which
              are children of this association.

       GrpSubmitJobs
              Maximum number of jobs which can be in a pending or running
              state at any time in aggregate for this association and all
              associations which are children of this association.

       GrpWall
              Maximum wall clock time running jobs are able to be
              allocated in aggregate for this association and all
              associations which are children of this association.

       ID     The id of the association.

       LFT    Associations are kept in a hierarchy: this is the leftmost
              spot in the hierarchy.  When used with the RGT variable, all
              associations with a LFT inside this LFT and before the RGT
              are children of this association.

       MaxTRESMins
              Maximum number of TRES minutes each job is able to use.

       MaxTRES
              Maximum number of TRES each job is able to use.

       MaxJobs
              Maximum number of jobs each user is allowed to run at one
              time.

       MaxJobsAccrue
              Maximum number of pending jobs able to accrue age priority
              at any given time.

       MaxSubmitJobs
              Maximum number of jobs in a pending or running state at any
              time.

       MaxWall
              Maximum wall clock time each job is able to use.

       Qos    Valid QOS's for this association.

       ParentID
              The association id of the parent of this association.

       ParentName
              The account name of the parent of this association.

       Partition
              The name of a partition in the association.

       WithRawQOSLevel
              Display QosLevel in an unevaluated raw format, consisting of
              a comma separated list of QOS names prepended with ''
              (nothing), '+' or '-' for the association.  QOS names
              without +/- prepended were assigned (i.e., sacctmgr modify
              ... set QosLevel=qos_name) for the entity listed or on one
              of its parents in the hierarchy.  QOS names with +/-
              prepended indicate the QOS was added/filtered (i.e.,
              sacctmgr modify ... set QosLevel=[+-]qos_name) for the
              entity listed or on one of its parents in the hierarchy.
              Including WOPLimits will show exactly where each QOS was
              assigned, added or filtered in the hierarchy.

       RGT    Associations are kept in a hierarchy: this is the rightmost
              spot in the hierarchy.  When used with the LFT variable, all
              associations with a LFT inside this RGT and after the LFT
              are children of this association.

       User   The name of a user in the association.

SPECIFICATIONS FOR CLUSTERS
       Classification=<classification>
              Type of machine; current classifications are capability and
              capacity.

       Features=<comma separated list of feature names>
              Features that are specific to the cluster.  Federated jobs
              can be directed to clusters that contain the job's requested
              features.

       Federation=<federation>
              The federation that this cluster should be a member of.  A
              cluster can only be a member of one federation at a time.

       FedState=<state>
              The state of the cluster in the federation.
              Valid states are:

              ACTIVE Cluster will actively accept and schedule federated
                     jobs.

              INACTIVE
                     Cluster will not schedule or accept any jobs.

              DRAIN  Cluster will not accept any new jobs and will let
                     existing federated jobs complete.

              DRAIN+REMOVE
                     Cluster will not accept any new jobs and will remove
                     itself from the federation once all federated jobs
                     have completed.  When removed from the federation,
                     the cluster will accept jobs as a non-federated
                     cluster.

       Flags=<flag list>
              Comma separated list of attributes for a particular cluster.
              Current flags include CrayXT, FrontEnd, and MultipleSlurmd.

       Name=<name>
              The name of a cluster.  This should be equal to the
              ClusterName parameter in the slurm.conf configuration file
              for some Slurm-managed cluster.

       RPC=<rpc list>
              Comma separated list of numeric RPC values.

       WithFed
              Appends federation related columns to default format options
              (e.g. Federation,ID,Features,FedState).

       WOLimits
              Display information without limit information.  This is for
              a smaller default format of
              Cluster,ControlHost,ControlPort,RPC.

       NOTE: You can also use the general specifications list above in the
       GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES section.

LIST/SHOW CLUSTER FORMAT OPTIONS
       Classification
              Type of machine, i.e. capability or capacity.

       Cluster
              The name of the cluster.

       ControlHost
              When a slurmctld registers with the database the IP address
              of the controller is placed here.

       ControlPort
              When a slurmctld registers with the database the port the
              controller is listening on is placed here.

       Features
              The list of features on the cluster (if any).

       Federation
              The name of the federation this cluster is a member of (if
              any).

       FedState
              The state of the cluster in the federation (if a member of
              one).

       FedStateRaw
              Numeric value of the name of the FedState.

       Flags  Attributes possessed by the cluster.

       ID     The ID assigned to the cluster when a member of a
              federation.  This ID uniquely identifies the cluster and its
              jobs in the federation.

       NodeCount
              The current count of nodes associated with the cluster.

       NodeNames
              The current nodes associated with the cluster.

       PluginIDSelect
              The numeric value of the select plugin the cluster is using.

       RPC    When a slurmctld registers with the database the RPC version
              the controller is running is placed here.

       TRES   Trackable RESources (Billing, BB (Burst buffer), CPU,
              Energy, GRES, License, Memory, and Node) this cluster is
              accounting for.

       NOTE: You can also view the information about the root association
       for the cluster.  The Association format fields are described in
       the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.

SPECIFICATIONS FOR COORDINATOR
       Account=<comma separated list of account names>
              Account name to add this user as a coordinator to.

       Names=<comma separated list of user names>
              Names of coordinators.

       NOTE: To list coordinators use the WithCoordinator option with list
       account or list user.

SPECIFICATIONS FOR EVENTS
       All_Clusters
              Shortcut to get information on all clusters.

       All_Time
              Shortcut to get the time period covering all time.

       Clusters=<comma separated list of cluster names>
              List the events of the cluster(s).  Default is the cluster
              where the command was run.

       End=<OPT>
              Period ending of events.  Default is now.

              Valid time formats are...

              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]

       Event=<OPT>
              Specific events to look for; valid options are Cluster or
              Node, default is both.

       MaxTRES=<OPT>
              Max number of TRES affected by an event.

       MinTRES=<OPT>
              Min number of TRES affected by an event.

       Nodes=<comma separated list of node names>
              Node names affected by an event.

       Reason=<comma separated list of reasons>
              Reason an event happened.

       Start=<OPT>
              Period start of events.  Default is 00:00:00 of the previous
              day, unless states are given with the States= specification.
              If that is the case, the default behavior is to return
              events currently in the states specified.

              Valid time formats are...

              HH:MM[:SS] [AM|PM]
              MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
              MM/DD[/YY]-HH:MM[:SS]
              YYYY-MM-DD[THH:MM[:SS]]

       States=<comma separated list of states>
              State of a node in a node event.  If this is set, the event
              type is set automatically to Node.

       User=<comma separated list of users>
              Query against users who set the event.  If this is set, the
              event type is set automatically to Node since only the slurm
              user can perform a cluster event.

LIST/SHOW EVENT FORMAT OPTIONS
       Cluster
              The name of the cluster the event happened on.

       ClusterNodes
              The hostlist of nodes on a cluster in a cluster event.

       Duration
              Time period the event lasted.

       End    Period when the event ended.

       Event  Name of the event.

       EventRaw
              Numeric value of the name of the event.

       NodeName
              The node affected by the event.  In a cluster event, this is
              blank.

       Reason The reason an event happened.

       Start  Period when the event started.

       State  On a node event this is the formatted state of the node
              during the event.

       StateRaw
              On a node event this is the numeric value of the state of
              the node during the event.

       TRES   Number of TRES involved with the event.

       User   On a node event this is the user who caused the event to
              happen.

SPECIFICATIONS FOR FEDERATION
       Clusters[+|-]=<comma separated list of cluster names>
              List of clusters to add/remove to a federation.  A blank
              value (e.g. clusters=) will remove all clusters from the
              federation.
              NOTE: A cluster can only be a member of one federation.

       Name=<name>
              The name of the federation.

       Tree   Display federations in a hierarchical fashion.

LIST/SHOW FEDERATION FORMAT OPTIONS
       Features
              The list of features on the cluster.

       Federation
              The name of the federation.

       Cluster
              Name of the cluster that is a member of the federation.

       FedState
              The state of the cluster in the federation.

       FedStateRaw
              Numeric value of the name of the FedState.

       Index  The index of the cluster in the federation.

SPECIFICATIONS FOR JOB
       DerivedExitCode
              The derived exit code can be modified after a job completes
              based on the user's judgement of whether the job succeeded
              or failed.  The user can only modify the derived exit code
              of their own job.

       Comment
              The job's comment string when the AccountingStoreJobComment
              parameter in the slurm.conf file is set (or defaults) to
              YES.  The user can only modify the comment string of their
              own job.

       The DerivedExitCode and Comment fields are the only fields of a job
       record in the database that can be modified after job completion.
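
       For example, a completed job's record might be annotated with a
       command of the following form; the job id and strings are
       illustrative only, and a user may only modify their own jobs:

```
$ sacctmgr modify job where jobid=1234 set DerivedExitCode=1 \
      Comment="input file was corrupt"
```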

       The sacct command is the exclusive command to display job records
       from the Slurm database.

1052 NOTE: The group limits (GrpJobs, GrpNodes, etc.) are tested when a job
1053 is being considered for being allocated resources. If starting a job
1054 would cause any of its group limit to be exceeded, that job will not be
1055 considered for scheduling even if that job might preempt other jobs
1056 which would release sufficient group resources for the pending job to
1057 be initiated.
1058
1059
1060 Flags Used by the slurmctld to override or enforce certain character‐
1061 istics.
1062 Valid options are
1063
1064 DenyOnLimit
1065 If set, jobs using this QOS will be rejected at submis‐
1066 sion time if they do not conform to the QOS 'Max' limits.
1067 Group limits will also be treated like 'Max' limits, and
1068 jobs exceeding them will be denied. By default, jobs
1069 that go over these limits will pend until they conform.
1070 This currently only applies to QOS and Association lim‐
1071 its.
1072
1073 EnforceUsageThreshold
1074 If set, and the QOS also has a UsageThreshold, any jobs
1075 submitted with this QOS that fall below the UsageThresh‐
1076 old will be held until their Fairshare Usage goes above
1077 the Threshold.
1078
1079 NoReserve
1080 If this flag is set and backfill scheduling is used, jobs
1081 using this QOS will not reserve resources in the backfill
1082 schedule's map of resources allocated through time. This
1083 flag is intended for use with a QOS that may be preempted
1084 by jobs associated with all other QOS (e.g. use with a
1085 "standby" QOS). If this flag is used with a QOS which
1086 cannot be preempted by all other QOS, it could result in
1087 starvation of larger jobs.
1088
1089 PartitionMaxNodes
1090 If set, jobs using this QOS will be able to override the
1091 requested partition's MaxNodes limit.
1092
1093 PartitionMinNodes
1094 If set, jobs using this QOS will be able to override the
1095 requested partition's MinNodes limit.
1096
1097 OverPartQOS
1098 If set, jobs using this QOS will be able to override any
1099 limits imposed by the requested partition's QOS.
1100
1101 PartitionTimeLimit
1102 If set, jobs using this QOS will be able to override the
1103 requested partition's TimeLimit.
1104
1105 RequiresReservation
1106 If set, jobs using this QOS must designate a reservation
1107 when submitting a job. This option can be useful in
1108 restricting usage of a QOS that may have greater preemp‐
1109 tive capability or additional resources to be allowed
1110 only within a reservation.
1111
1112 NoDecay
1113 If set, this QOS will not have its GrpTRESMins, GrpWall
1114 and UsageRaw decayed by the slurm.conf PriorityDecay‐
1115 HalfLife or PriorityUsageResetPeriod settings. This
1116 allows a QOS to provide aggregate limits that, once con‐
1117 sumed, will not be replenished automatically. Such a QOS
1118 will act as a time-limited quota of resources for an
1119 association that has access to it. Account/user usage
1120 will still be decayed for associations using the QOS.
1121 The QOS GrpTRESMins and GrpWall limits can be increased
1122 or the QOS RawUsage value reset to 0 (zero) to again
1123 allow jobs submitted with this QOS to be queued (if Deny‐
1124 OnLimit is set) or run (pending with QOSGrp{TRES}Minutes‐
1125 Limit or QOSGrpWallLimit reasons, where {TRES} is some
1126 type of trackable resource).
1127
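        For example, flags might be set on a hypothetical "standby" QOS
        (the QOS name here is illustrative) with:

        > sacctmgr modify qos standby set Flags=NoReserve,DenyOnLimit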
1128
1129 GraceTime
1130 Preemption grace time to be extended to a job which has been
1131 selected for preemption.
1132
1133
1134 GrpTRESMins
1135 The total number of TRES minutes that can possibly be used by
1136 past, present and future jobs running from this QOS.
1137
1138
1139 GrpTRESRunMins
1140 Used to limit the combined total number of TRES minutes used by
1141 all jobs running with this QOS. This takes into consideration
1142 the time limit of running jobs and consumes it; if the limit is
1143 reached no new jobs are started until other jobs finish.
1144
1145
1146 GrpTRES
1147 Maximum number of TRES running jobs are able to be allocated in
1148 aggregate for this QOS.
1149
1150
1151 GrpJobs
1152 Maximum number of running jobs in aggregate for this QOS.
1153
1154
1155 GrpJobsAccrue
1156 Maximum number of pending jobs in aggregate able to accrue age
1157 priority for this QOS.
1158
1159
1160 GrpSubmitJobs
1161 Maximum number of jobs which can be in a pending or running
1162 state at any time in aggregate for this QOS.
1163
1164
1165 GrpWall
1166 Maximum wall clock time running jobs are able to be allocated in
1167 aggregate for this QOS. If this limit is reached, submission
1168 requests will be denied and the running jobs will be killed.
1169
1170 ID The id of the QOS.
1171
1172
1173 MaxTRESMins
1174 Maximum number of TRES minutes each job is able to use.
1175
1176
1177 MaxTRESPerAccount
1178 Maximum number of TRES each account is able to use.
1179
1180
1181 MaxTRESPerJob
1182 Maximum number of TRES each job is able to use.
1183
1184
1185 MaxTRESPerNode
1186 Maximum number of TRES each node in a job allocation can use.
1187
1188
1189 MaxTRESPerUser
1190 Maximum number of TRES each user is able to use.
1191
1192
1193 MaxJobsAccruePerAccount
1194 Maximum number of pending jobs an account (or subacct) can have
1195 accruing age priority at any given time.
1196
1197
1198 MaxJobsAccruePerUser
1199 Maximum number of pending jobs a user can have accruing age pri‐
1200 ority at any given time.
1201
1202
1203 MaxJobsPerAccount
1204 Maximum number of jobs each account is allowed to run at one
1205 time.
1206
1207
1208 MaxJobsPerUser
1209 Maximum number of jobs each user is allowed to run at one time.
1210
1211
1212 MinPrioThreshold
1213 Minimum priority required to reserve resources when scheduling.
1214
1215
1216 MinTRESPerJob
1217 Minimum number of TRES each job running under this QOS must
1218 request. Otherwise the job will pend until modified.
1219
1220
1221 MaxSubmitJobsPerAccount
1222 Maximum number of jobs in a pending or running state at
1223 any time per account.
1224
1225
1226 MaxSubmitJobsPerUser
1227 Maximum number of jobs in a pending or running state at
1228 any time per user.
1229
1230
1231 MaxWall
1232 Maximum wall clock time each job is able to use.
1233
1234
1235 Name Name of the QOS.
1236
1237
1238 Preempt
1239 Other QOS' this QOS can preempt.
1240
1241
1242 PreemptMode
1243 Mechanism used to preempt jobs of this QOS if the cluster's
1244 PreemptType is configured to preempt/qos. The default preemption
1245 mechanism is specified by the cluster-wide PreemptMode configu‐
1246 ration parameter. Possible values are "Cluster" (meaning use
1247 cluster default), "Cancel", "Checkpoint" and "Requeue". This
1248 option is not compatible with PreemptMode=OFF or Preempt‐
1249 Mode=SUSPEND (i.e. preempted jobs must be removed from the
1250 resources).
1251
1252
1253 Priority
1254 What priority will be added to a job's priority when using this
1255 QOS.
1256
1257
1258 RawUsage=<value>
1259 This allows an administrator to reset the raw usage accrued to a
1260 QOS. The only value currently supported is 0 (zero). This is a
1261 settable specification only - it cannot be used as a filter to
1262 list QOS.
1263
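        For example, the raw usage accrued to a hypothetical QOS named
        "normal" could be reset with:

        > sacctmgr modify qos normal set RawUsage=0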
1264
1265 UsageFactor
1266 Usage factor when running with this QOS.
1267
1268
1269 UsageThreshold
1270 A float representing the lowest fairshare an association may
1271 have and still run jobs. If an association falls below this
1272 threshold and has pending jobs or submits new jobs, those jobs
1273 will be held until the usage goes back above the threshold. Use
1274 sshare to see current shares on the system.
1275
1276
1277 WithDeleted
1278 Display information with previously deleted data.
1279
1280
1281
1283 Description
1284 An arbitrary string describing a QOS.
1285
1286
1287 GraceTime
1288 Preemption grace time to be extended to a job which has been
1289 selected for preemption, in the format hh:mm:ss. The default
1290 value is zero; no preemption grace time is allowed on this par‐
1291 tition. NOTE: This value is only meaningful for QOS Preempt‐
1292 Mode=CANCEL.
1293
1294
1295 GrpTRESMins
1296 The total number of TRES minutes that can possibly be used by
1297 past, present and future jobs running from this QOS. To clear a
1298 previously set value use the modify command with a new value of
1299 -1 for each TRES id. NOTE: This limit only applies when using
1300 the Priority Multifactor plugin. The time is decayed using the
1301 value of PriorityDecayHalfLife or PriorityUsageResetPeriod as
1302 set in the slurm.conf. When this limit is reached, all running
1303 jobs associated with the QOS will be killed and all future jobs
1304 submitted with this QOS will be delayed until they are able to
1305 run inside the limit.
1306
1307
1308 GrpTRES
1309 Maximum number of TRES running jobs are able to be allocated in
1310 aggregate for this QOS. To clear a previously set value use the
1311 modify command with a new value of -1 for each TRES id.
1312
1313
1314 GrpJobs
1315 Maximum number of running jobs in aggregate for this QOS. To
1316 clear a previously set value use the modify command with a new
1317 value of -1.
1318
1319
1320 GrpJobsAccrue
1321 Maximum number of pending jobs in aggregate able to accrue age
1322 priority for this QOS. To clear a previously set value use the
1323 modify command with a new value of -1.
1324
1325
1326 GrpSubmitJobs
1327 Maximum number of jobs which can be in a pending or running
1328 state at any time in aggregate for this QOS. To clear a previ‐
1329 ously set value use the modify command with a new value of -1.
1330
1331
1332 GrpWall
1333 Maximum wall clock time running jobs are able to be allocated in
1334 aggregate for this QOS. To clear a previously set value use the
1335 modify command with a new value of -1. NOTE: This limit only
1336 applies when using the Priority Multifactor plugin. The time is
1337 decayed using the value of PriorityDecayHalfLife or Priori‐
1338 tyUsageResetPeriod as set in the slurm.conf. When this limit is
1339 reached, all running jobs associated with the QOS will be killed
1340 and all future jobs submitted with this QOS will be delayed
1341 until they are able to run inside the limit.
1342
1343
1344 MaxTRESMins
1345 Maximum number of TRES minutes each job is able to use. To
1346 clear a previously set value use the modify command with a new
1347 value of -1 for each TRES id.
1348
1349
1350 MaxTRESPerAccount
1351 Maximum number of TRES each account is able to use. To clear a
1352 previously set value use the modify command with a new value of
1353 -1 for each TRES id.
1354
1355
1356 MaxTRESPerJob
1357 Maximum number of TRES each job is able to use. To clear a pre‐
1358 viously set value use the modify command with a new value of -1
1359 for each TRES id.
1360
1361
1362 MaxTRESPerNode
1363 Maximum number of TRES each node in a job allocation can use.
1364 To clear a previously set value use the modify command with a
1365 new value of -1 for each TRES id.
1366
1367
1368 MaxTRESPerUser
1369 Maximum number of TRES each user is able to use. To clear a
1370 previously set value use the modify command with a new value of
1371 -1 for each TRES id.
1372
1373
1374 MaxJobsPerAccount
1375 Maximum number of jobs each account is allowed to run at one
1376 time. To clear a previously set value use the modify command
1377 with a new value of -1.
1378
1379
1380 MaxJobsPerUser
1381 Maximum number of jobs each user is allowed to run at one time.
1382 To clear a previously set value use the modify command with a
1383 new value of -1.
1384
1385
1386 MaxSubmitJobsPerAccount
1387 Maximum number of jobs in a pending or running state at any
1388 time per account. To clear a previously set value use the
1389 modify command with a new value of -1.
1390
1391
1392 MaxSubmitJobsPerUser
1393 Maximum number of jobs in a pending or running state at any
1394 time per user. To clear a previously set value use the modify
1395 command with a new value of -1.
1396
1397
1398 MaxWall
1399 Maximum wall clock time each job is able to use. <max wall>
1400 format is <min> or <min>:<sec> or <hr>:<min>:<sec> or
1401 <days>-<hr>:<min>:<sec> or <days>-<hr>. The value is recorded
1402 in minutes with rounding as needed. To clear a previously set
1403 value use the modify command with a new value of -1.
1404
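        As an illustration, a one-day wall clock limit might be set on a
        hypothetical "long" QOS and later cleared with:

        > sacctmgr modify qos long set MaxWall=1-00:00:00
        > sacctmgr modify qos long set MaxWall=-1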
1405
1406 MinPrioThreshold
1407 Minimum priority required to reserve resources when scheduling.
1408 To clear a previously set value use the modify command with a
1409 new value of -1.
1410
1411
1412 MinTRES
1413 Minimum number of TRES each job running under this QOS must
1414 request. Otherwise the job will pend until modified. To clear
1415 a previously set value use the modify command with a new value
1416 of -1 for each TRES id.
1417
1418
1419 Name Name of the QOS. Needed for creation.
1420
1421
1422 Preempt
1423 Other QOS' this QOS can preempt. Setting Preempt to '' (two
1424 single quotes with nothing between them) restores its default
1425 setting. You can also use the operators += and -= to add or
1426 remove certain QOS's from a QOS list.
1427
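        For example, assuming QOS named "high", "standby" and "low"
        already exist, the preemption list could be adjusted with:

        > sacctmgr modify qos high set Preempt+=standby,low
        > sacctmgr modify qos high set Preempt-=low
        > sacctmgr modify qos high set Preempt=''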
1428
1429 PreemptMode
1430 Mechanism used to preempt jobs of this QOS if the cluster's
1431 PreemptType is configured to preempt/qos. The default preemption
1432 mechanism is specified by the cluster-wide PreemptMode configu‐
1433 ration parameter. Possible values are "Cluster" (meaning use
1434 cluster default), "Cancel", "Checkpoint" and "Requeue". This
1435 option is not compatible with PreemptMode=OFF or Preempt‐
1436 Mode=SUSPEND (i.e. preempted jobs must be removed from the
1437 resources).
1438
1439
1440 Priority
1441 What priority will be added to a job's priority when using this
1442 QOS. To clear a previously set value use the modify command
1443 with a new value of -1.
1444
1445
1446 UsageFactor
1447 Usage factor when running with this QOS. This is a float that
1448 is factored into the priority time calculations of running
1449 jobs. For example, if the UsageFactor of a QOS is 2, every
1450 TRESBillingUnit second a job runs counts as 2; if it is 0.5,
1451 every second counts as only half the time. Setting this value
1452 to 0 means that running jobs will not add time to fairshare or
1453 to association/QOS limits. To clear a previously set value use
1454 the modify command with a new value of -1.
1456
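        For example, a hypothetical half-rate "standby" QOS might be
        configured with:

        > sacctmgr modify qos standby set UsageFactor=0.5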
1457
1459 Clusters=<comma separated list of cluster names>
1460 List the reservations of the cluster(s). Default is the cluster
1461 where the command was run.
1462
1463
1464 End=<OPT>
1465 Period ending of reservations. Default is now.
1466
1467 Valid time formats are...
1468
1469 HH:MM[:SS] [AM|PM]
1470 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
1471 MM/DD[/YY]-HH:MM[:SS]
1472 YYYY-MM-DD[THH:MM[:SS]]
1473
1474
1475 ID=<OPT>
1476 Comma separated list of reservation ids.
1477
1478
1479 Names=<OPT>
1480 Comma separated list of reservation names.
1481
1482
1483 Nodes=<comma separated list of node names>
1484 Node names on which the reservation ran.
1485
1486
1487 Start=<OPT>
1488 Period start of reservations. Default is 00:00:00 of current
1489 day.
1490
1491 Valid time formats are...
1492
1493 HH:MM[:SS] [AM|PM]
1494 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
1495 MM/DD[/YY]-HH:MM[:SS]
1496 YYYY-MM-DD[THH:MM[:SS]]
1497
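        As an illustration, reservations that ran on a hypothetical
        cluster "tux" during a given day (names and dates are
        illustrative) might be listed with:

        > sacctmgr list reservation cluster=tux start=2020-01-01 end=2020-01-02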
1498
1500 Associations
1501 The IDs of the associations able to run in the reservation.
1502
1503
1504 Cluster
1505 Name of cluster reservation was on.
1506
1507
1508 End End time of reservation.
1509
1510
1511 Flags Flags on the reservation.
1512
1513
1514 ID Reservation ID.
1515
1516
1517 Name Name of this reservation.
1518
1519
1520 NodeNames
1521 List of nodes in the reservation.
1522
1523
1524 Start Start time of reservation.
1525
1526
1527 TRES List of TRES in the reservation.
1528
1529
1530 UnusedWall
1531 Wall clock time in seconds unused by any job.
1532
1533
1534
1536 Clusters=<name list> Comma separated list of cluster names on which
1537 specified resources are to be available. If no names are designated
1538 then the clusters already allowed to use this resource will be altered.
1539
1540
1541 Count=<OPT>
1542 Number of software resources of a specific name configured on
1543 the system being controlled by a resource manager.
1544
1545
1546 Descriptions=
1547 A brief description of the resource.
1548
1549
1550 Flags=<OPT>
1551 Flags that identify specific attributes of the system resource.
1552 At this time no flags have been defined.
1553
1554
1555 ServerType=<OPT>
1556 The type of software resource manager providing the licenses.
1557 For example, FlexNet Publisher (flexlm) license server or
1558 Reprise License Manager (RLM).
1559
1560
1561 Names=<OPT>
1562 Comma separated list of the names of resources configured on the
1563 system being controlled by a resource manager. If this resource
1564 is seen on the slurmctld its name will be name@server to dis‐
1565 tinguish it from local resources defined in a slurm.conf.
1566
1567
1568 PercentAllowed=<percent allowed>
1569 Percentage of a specific resource that can be used on specified
1570 cluster.
1571
1572
1573 Server=<OPT>
1574 The name of the server serving up the resource. Default is
1575 'slurmdb' indicating the licenses are being served by the data‐
1576 base.
1577
1578
1579 Type=<OPT>
1580 The type of the resource represented by this record. Currently
1581 the only valid type is License.
1582
1583
1584 WithClusters
1585 Display each cluster's percentage of the resources. If a
1586 resource hasn't been given to a cluster, the resource will not
1587 be displayed with this flag.
1588
1589
1590 NOTE: Resource is used to define each resource configured on a system
1591 available for usage by Slurm clusters.
1592
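        For example, a hypothetical license resource could be added and
        half of it granted to a cluster named "tux" (all names here are
        illustrative) with:

        > sacctmgr add resource name=matlab server=flex_host servertype=flexlm count=100 type=license percentallowed=50 cluster=tux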
1593
1595 Cluster
1596 Name of cluster resource is given to.
1597
1598
1599 Count The count of a specific resource configured on the system glob‐
1600 ally.
1601
1602
1603 Allocated
1604 The percent of licenses allocated to a cluster.
1605
1606
1607 Description
1608 Description of the resource.
1609
1610
1611 ServerType
1612 The type of the server controlling the licenses.
1613
1614
1615 Name Name of this resource.
1616
1617
1618 Server Server serving up the resource.
1619
1620
1621 Type Type of resource this record represents.
1622
1623
1625 Cluster
1626 Name of cluster job ran on.
1627
1628
1629 ID Id of the job.
1630
1631
1632 Name Name of the job.
1633
1634
1635 Partition
1636 Partition job ran on.
1637
1638
1639 State Current State of the job in the database.
1640
1641
1642 TimeStart
1643 Time job started running.
1644
1645
1646 TimeEnd
1647 Current recorded time of the end of the job.
1648
1649
1651 Accounts=<comma separated list of account names>
1652 Only print out the transactions affecting specified accounts.
1653
1654
1655 Action=<Specific action the list will display>
1656
1657
1658 Actor=<Specific name the list will display>
1659 Only display transactions done by a certain person.
1660
1661
1662 Clusters=<comma separated list of cluster names>
1663 Only print out the transactions affecting specified clusters.
1664
1665
1666 End=<Date and time of last transaction to return>
1667 Return all transactions before this Date and time. Default is
1668 now.
1669
1670
1671 Start=<Date and time of first transaction to return>
1672 Return all transactions after this Date and time. Default is
1673 epoch.
1674
1675 Valid time formats for End and Start are...
1676
1677 HH:MM[:SS] [AM|PM]
1678 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
1679 MM/DD[/YY]-HH:MM[:SS]
1680 YYYY-MM-DD[THH:MM[:SS]]
1681
1682
1683 Users=<comma separated list of user names>
1684 Only print out the transactions affecting specified users.
1685
1686
1687 WithAssoc
1688 Get information about which associations were affected by the
1689 transactions.
1690
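        For example, transactions performed by a given actor since a
        given date (values are illustrative) might be listed with:

        > sacctmgr list transactions Actor=root Start=2020-01-01 format=TimeStamp,Actor,Action,Where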
1691
1692
1694 Action
1695
1696
1697 Actor
1698
1699
1700 Info
1701
1702
1703 TimeStamp
1704
1705
1706 Where
1707
1708 NOTE: If using the WithAssoc option you can also view the information
1709 about the various associations the transaction affected. The Associa‐
1710 tion format fields are described in the LIST/SHOW ASSOCIATION FORMAT
1711 OPTIONS section.
1712
1713
1714
1716 Account=<account>
1717 Account name to add this user to.
1718
1719
1720 AdminLevel=<level>
1721 Admin level of user. Valid levels are None, Operator, and
1722 Admin.
1723
1724
1725 Cluster=<cluster>
1726 Specific cluster to add user to the account on. Default is all
1727 in system.
1728
1729
1730 DefaultAccount=<account>
1731 Identify the default bank account name to be used for a job if
1732 none is specified at submission time.
1733
1734
1735 DefaultWCKey=<defaultwckey>
1736 Identify the default Workload Characterization Key.
1737
1738
1739 Name=<name>
1740 Name of user.
1741
1742
1743 NewName=<newname>
1744 Use to rename a user in the accounting database.
1745
1746
1747 Partition=<name>
1748 Partition name.
1749
1750
1751 RawUsage=<value>
1752 This allows an administrator to reset the raw usage accrued to a
1753 user. The only value currently supported is 0 (zero). This is
1754 a settable specification only - it cannot be used as a filter to
1755 list users.
1756
1757
1758 WCKeys=<wckeys>
1759 Workload Characterization Key values.
1760
1761
1762 WithAssoc
1763 Display all associations for this user.
1764
1765
1766 WithCoord
1767 Display all accounts a user is coordinator for.
1768
1769
1770 WithDeleted
1771 Display information with previously deleted data.
1772
1773 NOTE: If using the WithAssoc option you can also query against associa‐
1774 tion specific information to view only certain associations this user
1775 may have. These extra options can be found in the SPECIFICATIONS FOR
1776 ASSOCIATIONS section. You can also use the general specifications list
1777 above in the GENERAL SPECIFICATIONS FOR ASSOCIATION BASED ENTITIES sec‐
1778 tion.
1779
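        For example, a hypothetical user could be added to an account on
        a single cluster (user, account and cluster names are
        illustrative) with:

        > sacctmgr add user name=adam cluster=tux account=cs defaultaccount=cs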
1780
1781
1783 AdminLevel
1784 Admin level of user.
1785
1786
1787 DefaultAccount
1788 The user's default account.
1789
1790
1791 Coordinators
1792 List of users that are a coordinator of the account. (Only
1793 filled in when using the WithCoord option.)
1794
1795
1796 User The name of a user.
1797
1798 NOTE: If using the WithAssoc option you can also view the information
1799 about the various associations the user may have on all the clusters in
1800 the system. The association information can be filtered. Note that all
1801 the users in the database will always be shown, as the filter only
1802 applies to the association data. The Association format fields are
1803 described in the LIST/SHOW ASSOCIATION FORMAT OPTIONS section.
1804
1805
1806
1808 WCKey Workload Characterization Key.
1809
1810
1811 Cluster
1812 Specific cluster for the WCKey.
1813
1814
1815 User The name of a user for the WCKey.
1816
1817 NOTE: If using the WithAssoc option you can also view the information
1818 about the various associations the user may have on all the clusters in
1819 the system. The Association format fields are described in the
1820 LIST/SHOW ASSOCIATION FORMAT OPTIONS section.
1821
1822
1824 Name The name of the trackable resource. This option is required for
1825 TRES types BB (Burst buffer), GRES, and License. Types CPU,
1826 Energy, Memory, and Node do not have Names. For example, if the
1827 type is GRES, the name is the denomination of the GRES itself,
1828 e.g. GPU.
1829
1830
1831 ID The identification number of the trackable resource as it
1832 appears in the database.
1833
1834
1835 Type The type of the trackable resource. Current types are BB (Burst
1836 buffer), CPU, Energy, GRES, License, Memory, and Node.
1837
1838
1840 Trackable RESources (TRES) are used in many QOS or Association limits.
1841 When setting the limits they are specified as a comma separated list.
1842 Each TRES has a separate limit, i.e. GrpTRESMins=cpu=10,mem=20 creates
1843 2 different limits: 1 for 10 cpu minutes and 1 for 20 MB memory min‐
1844 utes. To remove a limit -1 is used, i.e. GrpTRESMins=cpu=-1 would
1845 remove only the cpu TRES limit.
1846
1847 NOTE: For GrpTRES limits dealing with nodes as a TRES, each job's node
1848 allocation is counted separately (i.e. if a single node has resources
1849 allocated to two jobs, this is counted as two allocated nodes).
1850
1851 NOTE: When dealing with Memory as a TRES all limits are in MB.
1852
1853 NOTE: The Billing TRES is calculated from a partition's TRESBilling‐
1854 Weights. It is temporarily calculated during scheduling for each parti‐
1855 tion to enforce billing TRES limits. The final Billing TRES is calcu‐
1856 lated after the job has been allocated resources. The final number can
1857 be seen in scontrol show jobs and sacct output.
1858
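        For example, assuming an association for an account named "cs"
        already exists, a combined CPU and memory limit could be set and
        the cpu portion later removed with:

        > sacctmgr modify account cs set GrpTRESMins=cpu=10,mem=20
        > sacctmgr modify account cs set GrpTRESMins=cpu=-1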
1859
1861 When using the format option for listing various fields you can append
1862 %NUMBER to specify how many characters should be printed.
1863
1864 e.g. format=name%30 will print 30 characters of the name field right
1865 justified. A -30 will print 30 characters left justified.
1866
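        For example, association account and user names might be listed
        left justified in 30-character columns with:

        > sacctmgr list associations format=Account%-30,User%-30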
1867
1869 sacctmgr has the capability to load and dump Slurm association data to
1870 and from a file. This method can easily add a new cluster or copy an
1871 existing cluster's associations into a new cluster with similar
1872 accounts. Each file contains Slurm association data for a single clus‐
1873 ter. Comments can be put into the file with the # character. Each
1874 line of information must begin with one of the four titles: Cluster,
1875 Parent, Account or User. Following the title is a space, dash, space,
1876 entity value, then specifications. Specifications are colon separated.
1877 If any variable such as Organization has a space in it, surround the
1878 value with single or double quotes.
1879
1880 To create a file of associations one can run
1881
1882 > sacctmgr dump tux file=tux.cfg
1883 (file=tux.cfg is optional)
1884
1885 To load a previously created file you can run
1886
1887 > sacctmgr load file=tux.cfg
1888
1889 Other options for load are -
1890
1891 clean - delete what was already there and start from scratch with this
1892 information.
1893 Cluster= - specify a different name for the cluster than that which is
1894 in the file.
1895
1896 A quick explanation of how the file works:
1897
1898 Since the associations in the system follow a hierarchy, the file does
1899 as well. Anything that is a parent needs to be defined before any chil‐
1900 dren. The only exception is the implicit 'root' account. This is
1901 always a default for any cluster and does not need to be defined.
1902
1903 To edit/create a file start with a cluster line for the new cluster
1904
1905 Cluster - cluster_name:MaxNodesPerJob=15
1906
1907 Anything included on this line will be the defaults for all associa‐
1908 tions on this cluster. These options are as follows...
1909
1910 GrpTRESMins=
1911 The total number of TRES minutes that can possibly be used by
1912 past, present and future jobs running from this association and
1913 its children.
1914
1915 GrpTRESRunMins=
1916 Used to limit the combined total number of TRES minutes used by
1917 all jobs running with this association and its children. This
1918 takes into consideration the time limit of running jobs and
1919 consumes it; if the limit is reached no new jobs are started
1920 until other jobs finish and free up time.
1921
1922 GrpTRES=
1923 Maximum number of TRES running jobs are able to be allocated in
1924 aggregate for this association and all associations which are
1925 children of this association.
1926
1927 GrpJobs=
1928 Maximum number of running jobs in aggregate for this association
1929 and all associations which are children of this association.
1930
1931 GrpJobsAccrue
1932 Maximum number of pending jobs in aggregate able to accrue age
1933 priority for this association and all associations which are
1934 children of this association.
1935
1936 GrpNodes=
1937 Maximum number of nodes running jobs are able to be allocated in
1938 aggregate for this association and all associations which are
1939 children of this association.
1940
1941 NOTE: Each job's node allocation is counted separately (i.e. if a sin‐
1942 gle node has resources allocated to two jobs, this is counted as two
1943 allocated nodes).
1944
1945 GrpSubmitJobs=
1946 Maximum number of jobs which can be in a pending or running
1947 state at any time in aggregate for this association and all
1948 associations which are children of this association.
1949
1950 GrpWall=
1951 Maximum wall clock time running jobs are able to be allocated in
1952 aggregate for this association and all associations which are
1953 children of this association.
1954
1955 FairShare=
1956 Number used in conjunction with other associations to determine
1957 job priority.
1958
1959 MaxJobs=
1960 Maximum number of jobs the children of this association can run.
1961
1962 MaxNodesPerJob=
1963 Maximum number of nodes per job the children of this association
1964 can run.
1965
1966 MaxWallDurationPerJob=
1967 Maximum time (not related to job size) jobs of children of this
1968 account can run.
1969
1970 QOS= Comma separated list of Quality of Service names (Defined in
1971 sacctmgr).
1972
1973
1974 Followed by Accounts you want in this fashion...
1975
1976 Parent - root (Defined by default)
1977 Account - cs:MaxNodesPerJob=5:MaxJobs=4:FairShare=399:MaxWallDu‐
1978 rationPerJob=40:Description='Computer Science':Organization='LC'
1979 Parent - cs
1980 Account - test:MaxNodesPerJob=1:MaxJobs=1:FairShare=1:MaxWallDu‐
1981 rationPerJob=1:Description='Test Account':Organization='Test'
1982
1983
1984 Any of the options after a ':' can be left out and they can be in any
1985 order.
1986 If you want to add any sub accounts just list the Parent THAT
1987 HAS ALREADY BEEN CREATED before the account line in this fash‐
1988 ion...
1989
1990 All account options are
1991
1992 Description=
1993 A brief description of the account.
1994
1995 GrpTRESMins=
1996 The total number of TRES minutes that can possibly be used by past,
1997 present and future jobs running from this association and its children.
1998
1999 GrpTRESRunMins=
2000 Used to limit the combined total number of TRES minutes used by all
2001 jobs running with this association and its children. This takes into
2002 consideration the time limit of running jobs and consumes it; if the
2003 limit is reached no new jobs are started until other jobs finish.
2004
2005 GrpTRES=
2006 Maximum number of TRES running jobs are able to be allocated in
2007 aggregate for this association and all associations which are
2008 children of this association.
2009
2010 GrpJobs=
2011 Maximum number of running jobs in aggregate for this association
2012 and all associations which are children of this association.
2013
2014 GrpJobsAccrue
2015 Maximum number of pending jobs in aggregate able to accrue age
2016 priority for this association and all associations which are
2017 children of this association.
2018
2019 GrpNodes=
2020 Maximum number of nodes running jobs are able to be allocated in
2021 aggregate for this association and all associations which are
2022 children of this association.
2023
2024 NOTE: Each job's node allocation is counted separately (i.e. if a sin‐
2025 gle node has resources allocated to two jobs, this is counted as two
2026 allocated nodes).
2027
2028 GrpSubmitJobs=
2029 Maximum number of jobs which can be in a pending or running
2030 state at any time in aggregate for this association and all
2031 associations which are children of this association.
2032
2033 GrpWall=
2034 Maximum wall clock time running jobs are able to be allocated in
2035 aggregate for this association and all associations which are
2036 children of this association.
2037
2038 FairShare=
2039 Number used in conjunction with other associations to determine
2040 job priority.
2041
2042 MaxJobs=
2043 Maximum number of jobs the children of this association can run.
2044
2045 MaxNodesPerJob=
2046 Maximum number of nodes per job the children of this association
2047 can run.
2048
2049 MaxWallDurationPerJob=
2050 Maximum time (not related to job size) jobs of children of this
2051 account can run.
2052
2053 Organization=
2054 Name of organization that owns this account.
2055
2056 QOS(=,+=,-=)
2057 Comma separated list of Quality of Service names (Defined in
2058 sacctmgr).
2059
2060
2061
2062 To add users to an account, add a line like this after a Parent -
2063 line
2064 Parent - test
2065 User - adam:MaxNodesPerJob=2:MaxJobs=3:Fair‐
2066 Share=1:MaxWallDurationPerJob=1:AdminLevel=Operator:Coor‐
2067 dinator='test'
2068
2069
       All user options are:

       AdminLevel=
              Type of admin this user is (Administrator, Operator).
              Must be defined on the first occurrence of the user.

       Coordinator=
              Comma-separated list of accounts this user is a coordinator
              over.
              Must be defined on the first occurrence of the user.

       DefaultAccount=
              System-wide default account name.
              Must be defined on the first occurrence of the user.

       FairShare=
              Number used in conjunction with other associations to
              determine job priority.

       MaxJobs=
              Maximum number of jobs this user can run.

       MaxNodesPerJob=
              Maximum number of nodes per job this user can run.

       MaxWallDurationPerJob=
              Maximum wall clock time (not related to job size) this
              user's jobs can run.

       QOS(=,+=,-=)
              Comma-separated list of Quality of Service names (defined
              in sacctmgr).

ARCHIVE FUNCTIONALITY
       sacctmgr has the capability to archive accounting data to a flat
       file and/or to load that data back later if needed. Archiving is
       usually done by the slurmdbd, and it is highly recommended that
       you only do it through sacctmgr if you completely understand what
       you are doing. For slurmdbd options, see "man slurmdbd" for more
       information. Data can be loaded from these files back into the
       database, either to view old data or to regenerate rolled-up data.

       These are the options for both dump and load of archive
       information.

       archive dump

       Directory=
              Directory to store the archive data.

       Events Archive events. If not specified and PurgeEventAfter is
              set, all event data removed will be lost permanently.

       Jobs   Archive jobs. If not specified and PurgeJobAfter is set,
              all job data removed will be lost permanently.

       PurgeEventAfter=
              Purge cluster event records older than the time stated, in
              months. If you want to purge on a shorter time period, you
              can append "hours" or "days" to the numeric value to get
              those more frequent purges (e.g. a value of '12hours' would
              purge everything older than 12 hours).

       PurgeJobAfter=
              Purge job records older than the time stated, in months.
              If you want to purge on a shorter time period, you can
              append "hours" or "days" to the numeric value (e.g. a value
              of '12hours' would purge everything older than 12 hours).

       PurgeStepAfter=
              Purge step records older than the time stated, in months.
              If you want to purge on a shorter time period, you can
              append "hours" or "days" to the numeric value (e.g. a value
              of '12hours' would purge everything older than 12 hours).

       PurgeSuspendAfter=
              Purge job suspend records older than the time stated, in
              months. If you want to purge on a shorter time period, you
              can append "hours" or "days" to the numeric value (e.g. a
              value of '12hours' would purge everything older than 12
              hours).

       Script=
              Run this script instead of the generic form of archiving to
              flat files.

       Steps  Archive steps. If not specified and PurgeStepAfter is set,
              all step data removed will be lost permanently.

       Suspend
              Archive suspend data. If not specified and PurgeSuspendAfter
              is set, all suspend data removed will be lost permanently.

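Tying the dump options together, a one-shot archive dump might look like the sketch below. The directory path, the choice of record types, and the purge periods are illustrative assumptions, not defaults.

```shell
# Hypothetical invocation: archive event, job, step and suspend records,
# purging job records older than 12 months and step records older than
# 2 months. The -i flag commits changes without asking for confirmation.
sacctmgr -i archive dump directory=/var/spool/slurm/archive \
    events jobs steps suspend \
    purgejobafter=12 purgestepafter=2
```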
       archive load
              Load previously archived data into the database.

       File=  File to load into the database.

       Insert=
              SQL to insert directly into the database. This should be
              used very cautiously, since you are writing raw SQL into
              the database.

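As a sketch, loading an archive back is a single command; the file name below is assumed to be one produced by an earlier archive dump.

```shell
# Hypothetical invocation: load a previously archived file back into the
# database, e.g. to inspect old job data or regenerate rolled-up data.
sacctmgr archive load file=/var/spool/slurm/archive/tux_job_archive
```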
ENVIRONMENT VARIABLES
       Some sacctmgr options may be set via environment variables. These
       environment variables, along with their corresponding options, are
       listed below. (Note: command-line options will always override
       these settings.)

       SLURM_CONF          The location of the Slurm configuration file.

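For example, to run sacctmgr against a non-default configuration file (the path below is an assumption; point it at your actual slurm.conf):

```shell
# Point this one invocation at an alternate slurm.conf; command-line
# options (-n: no header, -P: parsable output) still apply as usual.
SLURM_CONF=/etc/slurm/slurm.conf.test sacctmgr -n -P list cluster
```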
EXAMPLES
       NOTE: There is an order to setting up accounting associations. You
       must define clusters before you add accounts, and you must add
       accounts before you can add users.

       -> sacctmgr create cluster tux
       -> sacctmgr create account name=science fairshare=50
       -> sacctmgr create account name=chemistry parent=science fairshare=30
       -> sacctmgr create account name=physics parent=science fairshare=20
       -> sacctmgr create user name=adam cluster=tux account=physics fairshare=10
       -> sacctmgr delete user name=adam cluster=tux account=physics
       -> sacctmgr delete account name=physics cluster=tux
       -> sacctmgr modify user where name=adam cluster=tux account=physics set maxjobs=2 maxwall=30:00
       -> sacctmgr add user brian account=chemistry
       -> sacctmgr list associations cluster=tux format=Account,Cluster,User,Fairshare tree withd
       -> sacctmgr list transactions StartTime=11/03-10:30:00 format=Timestamp,Action,Actor
       -> sacctmgr dump cluster=tux file=tux_data_file
       -> sacctmgr load tux_data_file

       A user's account can not be changed directly. A new association
       needs to be created for the user with the new account, and then
       the association with the old account can be deleted.

       When modifying an object, placing the keyword 'set' and the
       optional keyword 'where' correctly is critical; below are examples
       that produce correct results. As a rule of thumb, anything you put
       in front of 'set' will be used as a qualifier. If you want to put
       a qualifier after the keyword 'set', you should use the keyword
       'where'.

       wrong-> sacctmgr modify user name=adam set fairshare=10 cluster=tux

       This will produce an error, as the above line reads "modify user
       adam, set fairshare=10 and cluster=tux".

       right-> sacctmgr modify user name=adam cluster=tux set fairshare=10
       right-> sacctmgr modify user name=adam set fairshare=10 where cluster=tux

       When changing the QOS for something, only use the '=' operator
       when you want to explicitly set the QOS to something. In most
       cases you will want to use the '+=' or '-=' operator to either add
       to or remove from the existing QOS already in place.

       If a user already has a QOS of normal,standby from a parent, or it
       was explicitly set, you should use qos+=expedite to add expedite
       to that list.

       If you want to add the QOS expedite to only a certain account
       and/or cluster, you can do that by specifying them on the sacctmgr
       line:

       -> sacctmgr modify user name=adam set qos+=expedite

       -> sacctmgr modify user name=adam acct=this cluster=tux set qos+=expedite

       Let's give an example of how to add a QOS to user accounts. List
       all available QOSs in the cluster:

       -> sacctmgr show qos format=name
           Name
       ---------
          normal
        expedite

       List all the associations in the cluster:

       -> sacctmgr show assoc format=cluster,account,qos
       Cluster   Account    QOS
       --------  ---------  ---------------
       zebra     root       normal
       zebra     root       normal
       zebra     g          normal
       zebra     g1         normal

       Add the QOS expedite to account g1 and display the result. With
       the += operator, the QOS expedite is added to the QOS already on
       this account:

       -> sacctmgr modify account name=g1 set qos+=expedite

       -> sacctmgr show assoc format=cluster,account,qos
       Cluster   Account    QOS
       --------  ---------  ---------------
       zebra     root       normal
       zebra     root       normal
       zebra     g          normal
       zebra     g1         expedite,normal

       Now set the QOS expedite as the only QOS for account g and display
       the result. With the = operator, expedite becomes the only QOS
       usable by account g:

       -> sacctmgr modify account name=g set qos=expedite

       -> sacctmgr show assoc format=cluster,account,qos
       Cluster   Account    QOS
       --------  ---------  ---------------
       zebra     root       normal
       zebra     root       normal
       zebra     g          expedite
       zebra     g1         expedite,normal

       If a new account is added under account g, it will inherit the QOS
       expedite and will not have access to QOS normal:

       -> sacctmgr add account banana parent=g

       -> sacctmgr show assoc format=cluster,account,qos
       Cluster   Account    QOS
       --------  ---------  ---------------
       zebra     root       normal
       zebra     root       normal
       zebra     g          expedite
       zebra     banana     expedite
       zebra     g1         expedite,normal

       An example of listing trackable resources:

       -> sacctmgr show tres
             Type              Name       ID
       ----------  ----------------  -------
              cpu                          1
              mem                          2
           energy                          3
             node                          4
          billing                          5
             gres         gpu:tesla    1001
          license               vcs    1002
               bb              cray    1003

COPYING
       Copyright (C) 2008-2010 Lawrence Livermore National Security.
       Produced at Lawrence Livermore National Laboratory (cf. DISCLAIMER).
       Copyright (C) 2010-2016 SchedMD LLC.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by
       the Free Software Foundation; either version 2 of the License, or
       (at your option) any later version.

       Slurm is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
       General Public License for more details.

SEE ALSO
       slurm.conf(5), slurmdbd(8)

June 2018                       Slurm Commands                     sacctmgr(1)