slurmdbd.conf(5)          Slurm Configuration File          slurmdbd.conf(5)

slurmdbd.conf - Slurm Database Daemon (SlurmDBD) configuration file

slurmdbd.conf is an ASCII file which describes Slurm Database Daemon
(SlurmDBD) configuration information. The file location can be modified
at system build time using the DEFAULT_SLURM_CONF parameter or at
execution time by setting the SLURM_CONF environment variable.

The contents of the file are case insensitive except for the names of
nodes and files. Any text following a "#" in the configuration file is
treated as a comment through the end of that line. Changes to the
configuration file take effect when SlurmDBD is restarted or when the
daemon receives the SIGHUP signal, unless otherwise noted.
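
As an illustration of the comment and case rules above, the following
is a minimal Python sketch of how a single "Key=Value" line could be
interpreted. The function name parse_conf_line is hypothetical, and
this is not the parser that Slurm itself uses.

```python
def parse_conf_line(line):
    """Return (key, value) for a 'Key=Value' line, or None otherwise."""
    # Everything from the first '#' to the end of the line is a comment.
    text = line.split("#", 1)[0].strip()
    if not text or "=" not in text:
        return None
    key, value = text.split("=", 1)
    # Keys are case insensitive, so normalize them. Values are left
    # untouched because node names and file names are case sensitive.
    return key.strip().lower(), value.strip()
```

In this sketch a "#" always starts a comment, so a value containing
"#" (for example a password) would be cut short at that character.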

This file should exist only on the computer where SlurmDBD executes and
should be readable only by the user that executes SlurmDBD (e.g.
"slurm"). If the slurmdbd daemon is started as user root and changes to
another user ID, the configuration file will initially be read as user
root, but will be read as the other user ID in response to a SIGHUP
signal. This file should be protected from unauthorized access since it
contains a database password. The overall configuration parameters
available include:

ArchiveDir
If ArchiveScript is not set, slurmdbd generates a file that can be read
back in at any time with sacctmgr load filename. This directory is
where the file is placed after a purge event has happened and archiving
for that element is set to true. Default is /tmp. The format of the
file's name is
$ArchiveDir/$ClusterName_$ArchiveObject_archive_$BeginTimeStamp_$endTimeStamp
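
The naming pattern above can be sketched in Python as follows. The
function name archive_file_name is hypothetical, and because the
textual form of the two timestamps is not stated here, they are passed
through as opaque strings.

```python
def archive_file_name(archive_dir, cluster, archive_object, begin, end):
    """Build an archive path following the documented pattern.

    begin and end are opaque strings; their real format is an
    assumption left to the caller.
    """
    return f"{archive_dir}/{cluster}_{archive_object}_archive_{begin}_{end}"
```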

ArchiveEvents
When purging events also archive them. Boolean, yes to archive event
data, no otherwise. Default is no.

ArchiveJobs
When purging jobs also archive them. Boolean, yes to archive job data,
no otherwise. Default is no.

ArchiveResvs
When purging reservations also archive them. Boolean, yes to archive
reservation data, no otherwise. Default is no.

ArchiveScript
This script can be executed every time a rollup happens (every hour,
day and month), depending on the Purge*After options. This script is
used to transfer accounting records out of the database into an
archive. It is used in place of the internal process used to archive
objects. The script is executed with no arguments. The following
environment variables are set:

SLURM_ARCHIVE_EVENTS
1 to archive events, 0 otherwise.

SLURM_ARCHIVE_LAST_EVENT
Time of last event start to archive.

SLURM_ARCHIVE_JOBS
1 to archive jobs, 0 otherwise.

SLURM_ARCHIVE_LAST_JOB
Time of last job submit to archive.

SLURM_ARCHIVE_STEPS
1 to archive steps, 0 otherwise.

SLURM_ARCHIVE_LAST_STEP
Time of last step start to archive.

SLURM_ARCHIVE_SUSPEND
1 to archive suspend data, 0 otherwise.

SLURM_ARCHIVE_TXN
1 to archive transaction data, 0 otherwise.

SLURM_ARCHIVE_USAGE
1 to archive usage data, 0 otherwise.

SLURM_ARCHIVE_LAST_SUSPEND
Time of last suspend start to archive.
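
A script used as ArchiveScript might begin by working out what it has
been asked to archive. The Python sketch below reads only the
environment variables listed above; the function name archive_requests
is hypothetical, and the time values are treated as opaque strings
since their format is not described here.

```python
import os

# Flag variable -> matching "time of last record" variable. No LAST_*
# variable is listed for transaction or usage data.
ARCHIVE_VARS = {
    "SLURM_ARCHIVE_EVENTS": "SLURM_ARCHIVE_LAST_EVENT",
    "SLURM_ARCHIVE_JOBS": "SLURM_ARCHIVE_LAST_JOB",
    "SLURM_ARCHIVE_STEPS": "SLURM_ARCHIVE_LAST_STEP",
    "SLURM_ARCHIVE_SUSPEND": "SLURM_ARCHIVE_LAST_SUSPEND",
    "SLURM_ARCHIVE_TXN": None,
    "SLURM_ARCHIVE_USAGE": None,
}


def archive_requests(env=None):
    """Map each flag set to "1" to its LAST_* value (or None)."""
    env = os.environ if env is None else env
    requests = {}
    for flag, last_var in ARCHIVE_VARS.items():
        if env.get(flag) == "1":
            requests[flag] = env.get(last_var) if last_var else None
    return requests
```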

ArchiveSteps
When purging steps also archive them. Boolean, yes to archive step
data, no otherwise. Default is no.

ArchiveSuspend
When purging suspend data also archive it. Boolean, yes to archive
suspend data, no otherwise. Default is no.

ArchiveTXN
When purging transaction data also archive it. Boolean, yes to archive
transaction data, no otherwise. Default is no.

ArchiveUsage
When purging usage data (Cluster, Association and WCKey) also archive
it. Boolean, yes to archive usage data, no otherwise. Default is no.

AuthInfo
Additional information to be used for authentication of communications
with the Slurm control daemon (slurmctld) on each cluster. The
interpretation of this option is specific to the configured AuthType.
In the case of auth/munge, this can be configured to use a Munge daemon
specifically configured to provide authentication between clusters
while the default Munge daemon provides authentication within a
cluster. In that case, this will specify the pathname of the socket to
use. By default this value is left unspecified, which results in the
default authentication mechanism being used.

AuthType
Define the authentication method for communications between Slurm
components. Acceptable values at present include "auth/none" and
"auth/munge". The default value is "auth/munge". Do not use "auth/none"
if you desire any security. "auth/munge" indicates that LLNL's MUNGE
system is to be used (this is the supported authentication mechanism
for Slurm; see "https://dun.github.io/munge/" for more information).
SlurmDBD must be terminated prior to changing the value of AuthType and
later restarted.

CommitDelay
The number of seconds between commits on a connection from a slurmctld.
Setting this speeds up inserts into the database dramatically and
should be considered for clusters with a very high job throughput. In
testing, a value of 1 second improves slurmdbd performance dramatically
and reduces overhead. There is a small probability of data loss, since
any data not yet committed could be lost if slurmdbd segfaults or
otherwise exits abnormally during that window. The risk is low, and
delayed commits may be the only way to run in extremely heavy
environments.

DbdBackupHost
The short, or long, name of the machine where the backup Slurm Database
Daemon is executed (i.e. the name returned by the command "hostname
-s"). This host must have access to the same underlying database
specified by the 'Storage' options mentioned below.

DbdAddr
Name that DbdHost should be referred to in establishing a
communications path. This name will be used as an argument to the
gethostbyname() function for identification. For example, "elx0000"
might be used to designate the Ethernet address for node "lx0000". By
default the DbdAddr will be identical in value to DbdHost.

DbdHost
The short, or long, name of the machine where the Slurm Database Daemon
is executed (i.e. the name returned by the command "hostname -s"). This
value must be specified.

DbdPort
The port number that the Slurm Database Daemon (slurmdbd) listens on
for work. The default value is SLURMDBD_PORT as established at system
build time. If none is explicitly specified, it will be set to 6819.
This value must be equal to the AccountingStoragePort parameter in the
slurm.conf file.

DebugFlags
Defines specific subsystems which should provide more detailed event
logging. Multiple subsystems can be specified with comma separators.
Most DebugFlags will result in verbose logging for the identified
subsystems and could impact performance. Valid subsystems available
today (with more to come) include:

DB_ARCHIVE      SQL statements/queries when dealing with archiving and
                purging the database.

DB_ASSOC        SQL statements/queries when dealing with associations
                in the database.

DB_EVENT        SQL statements/queries when dealing with (node) events
                in the database.

DB_JOB          SQL statements/queries when dealing with jobs in the
                database.

DB_QOS          SQL statements/queries when dealing with QOS in the
                database.

DB_QUERY        SQL statements/queries when dealing with transactions
                and such in the database.

DB_RESERVATION  SQL statements/queries when dealing with reservations
                in the database.

DB_RESOURCE     SQL statements/queries when dealing with resources like
                licenses in the database.

DB_STEP         SQL statements/queries when dealing with steps in the
                database.

DB_USAGE        SQL statements/queries when dealing with usage queries
                and inserts in the database.

DB_WCKEY        SQL statements/queries when dealing with wckeys in the
                database.

FEDERATION      SQL statements/queries when dealing with federations in
                the database.

DebugLevel
The level of detail to provide the Slurm Database Daemon's logs. The
default value is info.

quiet    Log nothing

fatal    Log only fatal errors

error    Log only errors

info     Log errors and general informational messages

verbose  Log errors and verbose informational messages

debug    Log errors and verbose informational messages and debugging
         messages

debug2   Log errors and verbose informational messages and more
         debugging messages

debug3   Log errors and verbose informational messages and even more
         debugging messages

debug4   Log errors and verbose informational messages and even more
         debugging messages

debug5   Log errors and verbose informational messages and even more
         debugging messages
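
The levels above read as an ordered scale in which each level adds to
the previous one. Under that assumption, the following Python sketch
shows how a configured DebugLevel would decide whether a message of a
given level is written. The names LEVELS and is_logged are
hypothetical and do not come from Slurm.

```python
# Levels in increasing order of verbosity, as listed above.
LEVELS = ["quiet", "fatal", "error", "info", "verbose",
          "debug", "debug2", "debug3", "debug4", "debug5"]


def is_logged(message_level, configured_level="info"):
    """True if a message at message_level passes configured_level.

    Assumes every level also logs everything the lower levels log.
    """
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)
```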

DebugLevelSyslog
The slurmdbd daemon will log events to the syslog file at the specified
level of detail. If not set, the slurmdbd daemon logs to syslog at
level fatal, with two exceptions: if there is no LogFile and the daemon
is running in the background, it logs to syslog at the level specified
by DebugLevel (or at fatal if DebugLevel is set to quiet); if it is
running in the foreground, the syslog level is set to quiet.

quiet    Log nothing

fatal    Log only fatal errors

error    Log only errors

info     Log errors and general informational messages

verbose  Log errors and verbose informational messages

debug    Log errors and verbose informational messages and debugging
         messages

debug2   Log errors and verbose informational messages and more
         debugging messages

debug3   Log errors and verbose informational messages and even more
         debugging messages

debug4   Log errors and verbose informational messages and even more
         debugging messages

debug5   Log errors and verbose informational messages and even more
         debugging messages
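
The default behavior of DebugLevelSyslog depends on several other
settings, so it may help to see it as a decision procedure. The Python
sketch below encodes one reading of the description above. The
function name and its arguments are hypothetical, and the order in
which the conditions are checked is an assumption.

```python
def effective_syslog_level(debug_level_syslog=None, log_file=None,
                           foreground=False, debug_level="info"):
    """Return the syslog level implied by the given settings."""
    if debug_level_syslog is not None:
        return debug_level_syslog      # explicit setting wins
    if foreground:
        return "quiet"                 # foreground: nothing to syslog
    if log_file is None:
        # Background with no LogFile: follow DebugLevel, but never
        # drop below fatal.
        return "fatal" if debug_level == "quiet" else debug_level
    return "fatal"                     # background with a LogFile
```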

DefaultQOS
When adding a new cluster, this will be used as the QOS for the cluster
unless one is explicitly set by the administrator when creating it.

LogFile
Fully qualified pathname of a file into which the Slurm Database
Daemon's logs are written. The default value is none (performs logging
via syslog). See the section LOGGING in the slurm.conf man page if a
pathname is specified.

LogTimeFormat
Format of the timestamp in slurmdbd log files. Accepted values are
"iso8601", "iso8601_ms", "rfc5424", "rfc5424_ms", "clock", "short" and
"thread_id". The values ending in "_ms" differ from the ones without in
that fractional seconds with millisecond precision are printed. The
default value is "iso8601_ms". The "rfc5424" formats are the same as
the "iso8601" formats except that the timezone value is also shown. The
"clock" format shows a timestamp in microseconds retrieved with the C
standard clock() function. The "short" format is a short date and time
format. The "thread_id" format shows the timestamp in the C standard
ctime() function form without the year but including the microseconds,
the daemon's process ID and the current thread ID.
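
To make the difference between the "iso8601" and "rfc5424" families
concrete, the Python sketch below renders one timestamp in each of the
four variants. It only approximates the layout using the standard
datetime module; the exact strings slurmdbd writes are not reproduced
here, and the function name format_timestamp is hypothetical.

```python
from datetime import datetime, timezone


def format_timestamp(dt, fmt="iso8601_ms"):
    """Approximate the iso8601 and rfc5424 variants described above."""
    if fmt not in ("iso8601", "iso8601_ms", "rfc5424", "rfc5424_ms"):
        raise ValueError(f"format not covered by this sketch: {fmt}")
    # The "_ms" variants add millisecond precision.
    timespec = "milliseconds" if fmt.endswith("_ms") else "seconds"
    if fmt.startswith("iso8601"):
        # No timezone value is shown.
        return dt.replace(tzinfo=None).isoformat(timespec=timespec)
    # rfc5424: same layout with the timezone offset appended.
    if dt.tzinfo is None:
        dt = dt.astimezone()  # treat naive values as local time
    return dt.isoformat(timespec=timespec)
```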

MaxQueryTimeRange
Return an error if a query is against too large a time span, to prevent
ill-formed queries from causing performance problems within SlurmDBD.
Default value is INFINITE which allows any queries to proceed. Accepted
time formats are the same as the MaxTime option in slurm.conf. User
SlurmUser and root are exempt from this restriction. Note that queries
which attempt to return over 3GB of data will still fail to complete
with ESLURM_RESULT_TOO_LARGE.
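
The check described above can be sketched in Python as follows. The
sketch assumes the MaxTime formats are minutes, minutes:seconds,
hours:minutes:seconds, days-hours, days-hours:minutes and
days-hours:minutes:seconds, as the slurm.conf man page is understood
to define them. The function names are hypothetical, and whether a
span exactly equal to the limit is accepted is also an assumption.

```python
def parse_time_limit(text):
    """Return the limit in seconds, or None for INFINITE/UNLIMITED."""
    if text.upper() in ("INFINITE", "UNLIMITED"):
        return None
    days = 0
    if "-" in text:
        # days-hours[:minutes[:seconds]]
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in rest.split(":")]
        parts += [0] * (3 - len(parts))
        hours, minutes, seconds = parts
    else:
        parts = [int(p) for p in text.split(":")]
        if len(parts) == 1:        # minutes
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:      # minutes:seconds
            hours, minutes, seconds = 0, parts[0], parts[1]
        elif len(parts) == 3:      # hours:minutes:seconds
            hours, minutes, seconds = parts
        else:
            raise ValueError(f"unrecognized time format: {text}")
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def query_allowed(span_seconds, limit_seconds):
    """True if a query spanning span_seconds is within the limit."""
    return limit_seconds is None or span_seconds <= limit_seconds
```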

MessageTimeout
Time permitted, in seconds, for a round-trip communication to complete.
Default value is 10 seconds.

Parameters
Contains arbitrary comma separated parameters used to alter the
behavior of the slurmdbd.

PreserveCaseUser
When defining users, do not force user names to lower case (forcing
lower case is the default behavior).

PidFile
Fully qualified pathname of a file into which the Slurm Database Daemon
may write its process ID. This may be used for automated signal
processing. The default value is "/var/run/slurmdbd.pid".

PluginDir
Identifies the places in which to look for Slurm plugins. This is a
colon-separated list of directories, like the PATH environment
variable. The default value is "/usr/local/lib/slurm".

PrivateData
This controls what type of information is hidden from regular users. By
default, all information is visible to all users. User SlurmUser, root,
and users with AdminLevel=Admin can always view all information.
Multiple values may be specified with a comma separator. Acceptable
values include:

accounts
        prevents users from viewing any account definitions unless they
        are coordinators of them.

events  prevents users from viewing event information unless they have
        operator status or above.

jobs    prevents users from viewing job records belonging to other
        users unless they are coordinators of the association running
        the job when using sacct.

reservations
        restricts getting reservation information to users with
        operator status and above.

usage   prevents users from viewing usage of any other user. This
        applies to sreport.

users   prevents users from viewing information of any user other than
        themselves. This also means users can only see associations
        they deal with. Coordinators can see associations of all users
        they are coordinator of, but can only see themselves when
        listing users.
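
Since PrivateData takes a comma-separated list drawn from a fixed set
of values, a site tool could validate it before deploying a new
configuration. The Python sketch below is one way to do that. The
names are hypothetical, and accepting mixed case is an assumption
based on the file being described as case insensitive.

```python
PRIVATE_DATA_VALUES = {"accounts", "events", "jobs",
                       "reservations", "usage", "users"}


def parse_private_data(value):
    """Split a PrivateData value and reject unknown entries."""
    items = {v.strip().lower() for v in value.split(",") if v.strip()}
    unknown = sorted(items - PRIVATE_DATA_VALUES)
    if unknown:
        raise ValueError(f"unknown PrivateData value(s): {unknown}")
    return items
```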

PurgeEventAfter
Events happening on the cluster over this age are purged from the
database. This includes node down times and such. The time is a numeric
value, interpreted as a number of months. To purge more often, append
"hours" or "days" to the numeric value (e.g. a value of "12hours" would
purge everything older than 12 hours). The purge takes place at the
start of each purge interval. For example, if the purge time is 2
months, the purge would happen at the beginning of each month. If not
set (default), then event records are never purged.

PurgeJobAfter
Individual job records over this age are purged from the database.
Aggregated information will be preserved to "PurgeUsageAfter". The time
is a numeric value, interpreted as a number of months. To purge more
often, append "hours" or "days" to the numeric value (e.g. a value of
"12hours" would purge everything older than 12 hours). The purge takes
place at the start of each purge interval. For example, if the purge
time is 2 months, the purge would happen at the beginning of each
month. If not set (default), then job records are never purged.

PurgeResvAfter
Individual reservation records over this age are purged from the
database. Aggregated information will be preserved to
"PurgeUsageAfter". The time is a numeric value, interpreted as a number
of months. To purge more often, append "hours" or "days" to the numeric
value (e.g. a value of "12hours" would purge everything older than 12
hours). The purge takes place at the start of each purge interval. For
example, if the purge time is 2 months, the purge would happen at the
beginning of each month. If not set (default), then reservation records
are never purged.

PurgeStepAfter
Individual job step records over this age are purged from the database.
Aggregated information will be preserved to "PurgeUsageAfter". The time
is a numeric value, interpreted as a number of months. To purge more
often, append "hours" or "days" to the numeric value (e.g. a value of
"12hours" would purge everything older than 12 hours). The purge takes
place at the start of each purge interval. For example, if the purge
time is 2 months, the purge would happen at the beginning of each
month. If not set (default), then job step records are never purged.

PurgeSuspendAfter
Records of individual suspend times for jobs over this age are purged
from the database. Aggregated information will be preserved to
"PurgeUsageAfter". The time is a numeric value, interpreted as a number
of months. To purge more often, append "hours" or "days" to the numeric
value (e.g. a value of "12hours" would purge everything older than 12
hours). The purge takes place at the start of each purge interval. For
example, if the purge time is 2 months, the purge would happen at the
beginning of each month. If not set (default), then suspend records are
never purged.

PurgeTXNAfter
Records of individual transaction times for transactions over this age
are purged from the database. The time is a numeric value, interpreted
as a number of months. To purge more often, append "hours" or "days" to
the numeric value (e.g. a value of "12hours" would purge everything
older than 12 hours). The purge takes place at the start of each purge
interval. For example, if the purge time is 2 months, the purge would
happen at the beginning of each month. If not set (default), then
transaction records are never purged.

PurgeUsageAfter
Usage records (Cluster, Association and WCKey) over this age are purged
from the database. The time is a numeric value, interpreted as a number
of months. To purge more often, append "hours" or "days" to the numeric
value (e.g. a value of "12hours" would purge everything older than 12
hours). The purge takes place at the start of each purge interval. For
example, if the purge time is 2 months, the purge would happen at the
beginning of each month. If not set (default), then usage records are
never purged.
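
All of the Purge*After options share the same value syntax: a number,
optionally followed by a unit, with months as the default unit. The
Python sketch below parses such a value. The function name
parse_purge_after is hypothetical, and accepting "month" or "months"
as an explicit suffix (as in values like "1month") and singular unit
names are assumptions of this sketch.

```python
import re

_PURGE_RE = re.compile(r"^(\d+)\s*(hours?|days?|months?)?$", re.IGNORECASE)


def parse_purge_after(value):
    """Return (number, unit) with unit in {"hours", "days", "months"}."""
    match = _PURGE_RE.match(value.strip())
    if not match:
        raise ValueError(f"unrecognized purge time: {value!r}")
    number = int(match.group(1))
    # A bare number is a number of months.
    unit = (match.group(2) or "months").lower().rstrip("s") + "s"
    return number, unit
```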

SlurmUser
The name of the user that the slurmctld daemon executes as. This user
must exist on the machine executing the Slurm Database Daemon and have
the same user ID as on the hosts on which slurmctld executes. For
security purposes, a user other than "root" is recommended. The default
value is "root".

StorageHost
Define the name of the host running the database where the data will be
stored. Ideally this should be the host on which slurmdbd executes.

StorageBackupHost
Define the name of the backup host running the database where the data
will be stored. This can be viewed as a backup solution when the
StorageHost is not responding. It is up to the backup solution to
enforce the coherency of the accounting information between the two
hosts. With clustered database solutions (active/passive HA), you would
not need to use this feature. Default is none.

StorageLoc
Specify the name of the database as the location where accounting
records are written. Defaults to "slurm_acct_db".

StoragePass
Define the password used to gain access to the database to store the
job accounting data. The '#' character is not permitted in a password.
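
Because the file holds this password, it should be readable only by
the user that runs SlurmDBD. The Python sketch below shows one way a
site might check the file's permission bits. The function name
is_private is hypothetical, and the check does not verify which user
owns the file.

```python
import os
import stat


def is_private(path):
    """True if group and others have no permissions on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A file with mode 0600 passes this check, while one with mode 0644
does not.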

StoragePort
The port number that the Slurm Database Daemon (slurmdbd) uses to
communicate with the database.

StorageType
Define the accounting storage mechanism type. Acceptable values at
present include "accounting_storage/mysql". The value
"accounting_storage/mysql" indicates that accounting records should be
written to a MySQL or MariaDB database specified by the StorageLoc
parameter. This value must be specified.

StorageUser
Define the name of the user used to connect to the database to store
the job accounting data.

TCPTimeout
Time permitted for TCP connection to be established. Default value is 2
seconds.

TrackWCKey
Boolean yes or no. Enables display and tracking of the Workload
Characterization Key (WCKey). Must be set to track wckey usage. This
must be set to generate rolled up usage tables from WCKeys. NOTE: If
TrackWCKey is set here and not in your various slurm.conf files, all
jobs will be attributed to their default WCKey.

TrackSlurmctldDown
Boolean yes or no. If set, the slurmdbd will mark all idle resources on
the cluster as down when a slurmctld disconnects or is no longer
reachable. The default is no.

#
# Sample /etc/slurmdbd.conf
#
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveResvs=yes
ArchiveSteps=no
ArchiveSuspend=no
ArchiveTXN=no
ArchiveUsage=no
#ArchiveScript=/usr/sbin/slurm.dbd.archive
AuthInfo=/var/run/munge/munge.socket.2
AuthType=auth/munge
DbdHost=db_host
DebugLevel=info
PurgeEventAfter=1month
PurgeJobAfter=12month
PurgeResvAfter=1month
PurgeStepAfter=1month
PurgeSuspendAfter=1month
PurgeTXNAfter=12month
PurgeUsageAfter=24month
LogFile=/var/log/slurmdbd.log
PidFile=/var/tmp/jette/slurmdbd.pid
SlurmUser=slurm_mgr
StoragePass=shazaam
StorageType=accounting_storage/mysql
StorageUser=database_mgr

Copyright (C) 2008-2010 Lawrence Livermore National Security. Produced
at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2014 SchedMD LLC.

This file is part of Slurm, a resource management program. For
details, see <https://slurm.schedmd.com/>.

Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

Slurm is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

/etc/slurmdbd.conf

slurm.conf(5), slurmctld(8), slurmdbd(8), syslog(2)

August 2018               Slurm Configuration File          slurmdbd.conf(5)