slurmdbd.conf(5) Slurm Configuration File slurmdbd.conf(5)

6 slurmdbd.conf - Slurm Database Daemon (SlurmDBD) configuration file
7
8
slurmdbd.conf is an ASCII file which describes Slurm Database Daemon
(SlurmDBD) configuration information. The file is always located in
the same directory as slurm.conf.

The contents of the file are case insensitive except for the names of
nodes and files. Any text following a "#" in the configuration file is
treated as a comment through the end of that line. Changes to the
configuration file take effect upon restart of SlurmDBD or daemon
receipt of the SIGHUP signal unless otherwise noted.

This file should exist only on the computer where SlurmDBD executes
and should be readable only by the user who executes SlurmDBD (e.g.
"slurm"). If the slurmdbd daemon is started as user root and changes
to another user ID, the configuration file will initially be read as
user root, but will be read as the other user ID in response to a
SIGHUP signal. This file should be protected from unauthorized access
since it contains a database password. The overall configuration
parameters available include:
28
29
ArchiveDir
If ArchiveScript is not set, the slurmdbd will generate a file that
can be read in at any time with sacctmgr load filename. This
directory is where the file will be placed after a purge event has
happened and archive for that element is set to true. Default is
/tmp. The format for this file's name is
$ArchiveDir/$ClusterName_$ArchiveObject_archive_$BeginTimeStamp_$EndTimeStamp
Archive files are limited to 50000 records per file. If more than
50000 records exist during that time period, they will be written to
a new file. Subsequent archive files during the same time period will
have ".<number>" appended to the file name, for example .2, with the
number increasing by one for each file in the same time period.
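
As an illustration of the archive settings above, here is a minimal
sketch. The directory /var/spool/slurm/archive and the cluster name
"tux" are hypothetical and chosen only for the example.

```conf
# Hypothetical directory; the default is /tmp.
ArchiveDir=/var/spool/slurm/archive
# Archive job records when they are purged.
ArchiveJobs=yes
PurgeJobAfter=12month
# Resulting files follow the documented pattern, for example
#   /var/spool/slurm/archive/tux_<ArchiveObject>_archive_<BeginTimeStamp>_<EndTimeStamp>
# with ".2", ".3", and so on appended once a period exceeds 50000 records.
```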
43
44 ArchiveEvents
45 When purging events also archive them. Boolean, yes to archive
46 event data, no otherwise. Default is no.
47
48 ArchiveJobs
49 When purging jobs also archive them. Boolean, yes to archive
50 job data, no otherwise. Default is no.
51
52 ArchiveResvs
53 When purging reservations also archive them. Boolean, yes to
54 archive reservation data, no otherwise. Default is no.
55
56 ArchiveScript
57 This script can be executed every time a rollup happens (every
58 hour, day and month), depending on the Purge*After options.
59 This script is used to transfer accounting records out of the
60 database into an archive. It is used in place of the internal
61 process used to archive objects. The script is executed with no
62 arguments, and the following environment variables are set.
63
64 SLURM_ARCHIVE_EVENTS
65 1 for archive events 0 otherwise.
66
67 SLURM_ARCHIVE_LAST_EVENT
68 Time of last event start to archive.
69
70 SLURM_ARCHIVE_JOBS
71 1 for archive jobs 0 otherwise.
72
73 SLURM_ARCHIVE_LAST_JOB
74 Time of last job submit to archive.
75
76 SLURM_ARCHIVE_STEPS
77 1 for archive steps 0 otherwise.
78
79 SLURM_ARCHIVE_LAST_STEP
80 Time of last step start to archive.
81
82 SLURM_ARCHIVE_SUSPEND
83 1 for archive suspend data 0 otherwise.
84
85 SLURM_ARCHIVE_TXN
86 1 for archive transaction data 0 otherwise.
87
88 SLURM_ARCHIVE_USAGE
89 1 for archive usage data 0 otherwise.
90
91 SLURM_ARCHIVE_LAST_SUSPEND
92 Time of last suspend start to archive.
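
A configuration sketch for the external-script path. The script path
is the one shown commented out in the sample file at the end of this
page; the script itself is site-provided and its contents are left to
the administrator.

```conf
# Run a site-provided script in place of the internal archive process.
ArchiveScript=/usr/sbin/slurm.dbd.archive
# The script runs at rollup time, depending on the Purge*After options.
PurgeJobAfter=12month
PurgeStepAfter=1month
# The script receives no arguments. It reads SLURM_ARCHIVE_JOBS,
# SLURM_ARCHIVE_LAST_JOB, SLURM_ARCHIVE_STEPS, SLURM_ARCHIVE_LAST_STEP
# and the other variables listed above from its environment.
```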
93
94 ArchiveSteps
95 When purging steps also archive them. Boolean, yes to archive
96 step data, no otherwise. Default is no.
97
98 ArchiveSuspend
99 When purging suspend data also archive it. Boolean, yes to ar‐
100 chive suspend data, no otherwise. Default is no.
101
102 ArchiveTXN
103 When purging transaction data also archive it. Boolean, yes to
104 archive transaction data, no otherwise. Default is no.
105
ArchiveUsage
When purging usage data (Cluster, Association and WCKey) also archive
it. Boolean, yes to archive usage data, no otherwise. Default is no.
110
111 AuthInfo
112 Additional information to be used for authentication of communi‐
113 cations with the Slurm control daemon (slurmctld) on each clus‐
114 ter. The interpretation of this option is specific to the con‐
115 figured AuthType. In the case of auth/munge, this can be con‐
116 figured to use a Munge daemon specifically configured to provide
117 authentication between clusters while the default Munge daemon
118 provides authentication within a cluster. In that case, this
119 will specify the pathname of the socket to use. Per default this
120 value is left unspecified, which results in the default authen‐
121 tication mechanism being used.
122
AuthAltTypes
Comma separated list of alternative authentication plugins that the
slurmdbd will permit for communication.
126
127 AuthAltParameters
128 Used to define alternative authentication plugins options. Mul‐
129 tiple options may be comma separated.
130
131 jwks= Absolute path to JWKS file. Only RS256 keys are sup‐
132 ported, although other key types may be listed in the
133 file. If set, no HS256 key will be loaded by default (and
134 token generation is disabled), although the jwt_key set‐
135 ting may be used to explicitly re-enable HS256 key use
136 (and token generation).
137
138 jwt_key=
139 Absolute path to JWT key file. Key must be HS256, and
140 should only be accessible by SlurmUser.
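
A sketch of enabling an alternative authentication plugin alongside
MUNGE. The plugin name auth/jwt and the key path are assumptions made
for illustration; neither is named in the text above.

```conf
# Keep MUNGE as the primary mechanism.
AuthType=auth/munge
# Assumed plugin name for JWT authentication.
AuthAltTypes=auth/jwt
# Hypothetical path. The key must be HS256 and should only be
# accessible by SlurmUser.
AuthAltParameters=jwt_key=/etc/slurm/jwt_hs256.key
```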
141
142 AuthType
143 Define the authentication method for communications between
144 Slurm components. Acceptable values at present include
145 "auth/munge", which is the default. "auth/munge" indicates that
146 LLNL's MUNGE system is to be used (this is the supported authen‐
147 tication mechanism for Slurm; see "https://dun.github.io/munge/"
148 for more information). SlurmDBD must be terminated prior to
149 changing the value of AuthType and later restarted.
150
CommitDelay
How many seconds between commits on a connection from a slurmctld.
Setting this speeds up inserts into the database dramatically. If you
are running a very high throughput of jobs you should consider
setting this. In testing, 1 second improves slurmdbd performance
dramatically and reduces overhead. The trade-off is a small
probability of data loss: the delay creates a window in which data
not yet committed could be lost if the slurmdbd seg faults or exits
abnormally for any reason. This situation should be very rare, but
the risk is present, and accepting it may be the only way to run in
extremely heavy environments.
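
For instance, the 1 second value mentioned above would be set as
follows (a sketch, not a recommendation for every site):

```conf
# Wait 1 second between commits on each slurmctld connection.
# Records not yet committed may be lost if slurmdbd exits abnormally.
CommitDelay=1
```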
164
165 CommunicationParameters
166 Comma separated options identifying communication options.
167
168 DisableIPv4 Disable IPv4 only operation for the slurmdbd.
169 This should also be set in your slurm.conf file.
170
171 EnableIPv6 Enable using IPv6 addresses for the slurmdbd.
172 When using both IPv4 and IPv6, address family
173 preferences will be based on your /etc/gai.conf
174 file. This should also be set in your slurm.conf
175 file.
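
A short sketch of the option syntax, using only the option names
listed above:

```conf
# Enable IPv6 addresses. IPv4 stays enabled unless DisableIPv4 is
# also listed. The same setting should appear in slurm.conf.
CommunicationParameters=EnableIPv6
```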
176
177 DbdBackupHost
178 The short, or long, name of the machine where the backup Slurm
179 Database Daemon is executed (i.e. the name returned by the com‐
180 mand "hostname -s"). This host must have access to the same un‐
181 derlying database specified by the 'Storage' options mentioned
182 below.
183
184 DbdAddr
185 Name that DbdHost should be referred to in establishing a commu‐
186 nications path. This name will be used as an argument to the
187 getaddrinfo() function for identification. For example,
188 "elx0000" might be used to designate the Ethernet address for
189 node "lx0000". By default the DbdAddr will be identical in
190 value to DbdHost.
191
192 DbdHost
193 The short, or long, name of the machine where the Slurm Database
194 Daemon is executed (i.e. the name returned by the command "host‐
195 name -s"). This value must be specified.
196
197 DbdPort
198 The port number that the Slurm Database Daemon (slurmdbd) lis‐
199 tens to for work. The default value is SLURMDBD_PORT as estab‐
200 lished at system build time. If no value is explicitly speci‐
201 fied, it will be set to 6819. This value must be equal to the
202 AccountingStoragePort parameter in the slurm.conf file.
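
The daemon addressing options above might be combined as in this
sketch. The host names dbd1, dbd2 and edbd1 are hypothetical.

```conf
# Required: host on which slurmdbd runs.
DbdHost=dbd1
# Optional: name passed to getaddrinfo(); defaults to DbdHost.
DbdAddr=edbd1
# Optional backup daemon; it must reach the same underlying database.
DbdBackupHost=dbd2
# Must equal AccountingStoragePort in slurm.conf.
DbdPort=6819
```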
203
204 DebugFlags
205 Defines specific subsystems which should provide more detailed
206 event logging. Multiple subsystems can be specified with comma
207 separators. Most DebugFlags will result in verbose logging for
208 the identified subsystems and could impact performance. Valid
209 subsystems available today (with more to come) include:
210
211 DB_ARCHIVE
212 SQL statements/queries when dealing with archiving and
213 purging the database.
214
215 DB_ASSOC
216 SQL statements/queries when dealing with associations in
217 the database.
218
219 DB_EVENT
220 SQL statements/queries when dealing with (node) events in
221 the database.
222
223 DB_JOB SQL statements/queries when dealing with jobs in the
224 database.
225
226 DB_QOS SQL statements/queries when dealing with QOS in the data‐
227 base.
228
229 DB_QUERY
230 SQL statements/queries when dealing with transactions and
231 such in the database.
232
233 DB_RESERVATION
234 SQL statements/queries when dealing with reservations in
235 the database.
236
237 DB_RESOURCE
238 SQL statements/queries when dealing with resources like
239 licenses in the database.
240
241 DB_STEP
242 SQL statements/queries when dealing with steps in the
243 database.
244
245 DB_TRES
246 SQL statements/queries when dealing with trackable re‐
247 sources in the database.
248
249 DB_USAGE
250 SQL statements/queries when dealing with usage queries
251 and inserts in the database.
252
253 DB_WCKEY
254 SQL statements/queries when dealing with wckeys in the
255 database.
256
257 FEDERATION
258 SQL statements/queries when dealing with federations in
259 the database.
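
As an example of the comma separated form, the following sketch
enables two of the subsystems listed above:

```conf
# Log SQL statements for job and step handling.
# Expect verbose output and a possible performance impact.
DebugFlags=DB_JOB,DB_STEP
```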
260
261 DebugLevel
262 The level of detail to provide the Slurm Database Daemon's logs.
263 The default value is info.
264
265 quiet Log nothing
266
267 fatal Log only fatal errors
268
269 error Log only errors
270
271 info Log errors and general informational messages
272
273 verbose Log errors and verbose informational messages
274
275 debug Log errors and verbose informational messages and de‐
276 bugging messages
277
278 debug2 Log errors and verbose informational messages and more
279 debugging messages
280
281 debug3 Log errors and verbose informational messages and even
282 more debugging messages
283
284 debug4 Log errors and verbose informational messages and even
285 more debugging messages
286
287 debug5 Log errors and verbose informational messages and even
288 more debugging messages
289
DebugLevelSyslog
The slurmdbd daemon will log events to the syslog file at the
specified level of detail. If not set, the slurmdbd daemon will log
to syslog at level fatal, with two exceptions. If there is no LogFile
and it is running in the background, it will log to syslog at the
level specified by DebugLevel (at fatal in the case that DebugLevel
is set to quiet). If it is run in the foreground, the syslog level
will be set to quiet.
298
299 quiet Log nothing
300
301 fatal Log only fatal errors
302
303 error Log only errors
304
305 info Log errors and general informational messages
306
307 verbose Log errors and verbose informational messages
308
309 debug Log errors and verbose informational messages and de‐
310 bugging messages
311
312 debug2 Log errors and verbose informational messages and more
313 debugging messages
314
315 debug3 Log errors and verbose informational messages and even
316 more debugging messages
317
318 debug4 Log errors and verbose informational messages and even
319 more debugging messages
320
321 debug5 Log errors and verbose informational messages and even
322 more debugging messages
323
324 NOTE: By default, Slurm's systemd service files start daemons in
325 the foreground with the -D option. This means that systemd will
326 capture stdout/stderr output and print that to syslog, indepen‐
327 dent of Slurm printing to syslog directly. To prevent systemd
328 from doing this, add "StandardOutput=null" and "StandardEr‐
329 ror=null" to the respective service files or override files.
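
One way to combine the logging options is sketched below. The levels
chosen are illustrative, and the log path is the one used in the
sample file at the end of this page.

```conf
# Write detailed logs to a file.
LogFile=/var/log/slurmdbd.log
DebugLevel=verbose
# Send only errors to syslog.
DebugLevelSyslog=error
```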
330
DefaultQOS
When adding a new cluster, this will be used as the QOS for the
cluster unless the admin explicitly sets one when creating it.
335
336 LogFile
337 Fully qualified pathname of a file into which the Slurm Database
338 Daemon's logs are written. The default value is none (performs
339 logging via syslog).
340 See the section LOGGING in the slurm.conf man page if a pathname
341 is specified.
342
LogTimeFormat
Format of the timestamp in slurmdbd log files. Accepted values are
"iso8601", "iso8601_ms", "rfc5424", "rfc5424_ms", "clock", "short",
and "thread_id". The values ending in "_ms" differ from the ones
without in that fractional seconds with millisecond precision are
printed. The default value is "iso8601_ms". The "rfc5424" formats are
the same as the "iso8601" formats except that the timezone value is
also shown. The "clock" format shows a timestamp in microseconds
retrieved with the C standard clock() function. The "short" format is
a short date and time format. The "thread_id" format shows the
timestamp in the C standard ctime() function form without the year
but including the microseconds, the daemon's process ID and the
current thread ID.
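
A one-line sketch selecting one of the accepted values:

```conf
# Same layout as iso8601_ms, with the timezone value also shown.
LogTimeFormat=rfc5424_ms
```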
356
MaxQueryTimeRange
Return an error if a query is against too large a time span, to
prevent ill-formed queries from causing performance problems within
SlurmDBD. Default value is INFINITE, which allows any query to
proceed. Accepted time formats are the same as the MaxTime option in
slurm.conf. Operator and higher privileged users are exempt from this
restriction. Note that queries which attempt to return over 3GB of
data will still fail to complete with ESLURM_RESULT_TOO_LARGE.
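
A sketch of a bounded query range, assuming the "days-hours" time
form accepted by MaxTime in slurm.conf. The 60 day limit is an
arbitrary illustrative value.

```conf
# Return an error for queries from regular users that span more
# than 60 days.
MaxQueryTimeRange=60-0
```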
366
367 MessageTimeout
368 Time permitted for a round-trip communication to complete in
369 seconds. Default value is 10 seconds.
370
371 Parameters
372 Contains arbitrary comma separated parameters used to alter the
373 behavior of the slurmdbd.
374
375 PreserveCaseUser
376 When defining users do not force lower case which is the
377 default behavior.
378
379 PidFile
380 Fully qualified pathname of a file into which the Slurm Database
381 Daemon may write its process ID. This may be used for automated
382 signal processing. The default value is "/var/run/slur‐
383 mdbd.pid".
384
385 PluginDir
386 Identifies the places in which to look for Slurm plugins. This
387 is a colon-separated list of directories, like the PATH environ‐
388 ment variable. The default value is the prefix given at config‐
389 ure time + "/lib/slurm".
390
391 PrivateData
392 This controls what type of information is hidden from regular
393 users. By default, all information is visible to all users.
394 User SlurmUser, root, and users with AdminLevel=Admin can always
395 view all information. Multiple values may be specified with a
396 comma separator. Acceptable values include:
397
398 accounts
399 prevents users from viewing any account definitions un‐
400 less they are coordinators of them.
401
402 events prevents users from viewing event information unless they
403 have operator status or above.
404
405 jobs prevents users from viewing job records belonging to
406 other users unless they are coordinators of the account
407 running the job when using sacct.
408
409 reservations
410 restricts getting reservation information to users with
411 operator status and above.
412
413 usage prevents users from viewing usage of any other user.
414 This applies to sreport.
415
users prevents users from viewing information of any user other than
themselves. This also makes it so users can only see associations
they deal with. Coordinators can see associations of all users in the
account they are coordinator of, but can only see themselves when
listing users.
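
As an example of the comma separated form, this sketch hides other
users' job records, usage and user information from regular users:

```conf
# Values are taken from the list above; any subset may be combined.
PrivateData=jobs,usage,users
```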
421
PurgeEventAfter
Events happening on the cluster over this age are purged from the
database. This includes node down times and such. The time is a
numeric value and is a number of months. To purge more often, append
"hours" or "days" to the numeric value (e.g. a value of "12hours"
would purge everything older than 12 hours). The purge takes place at
the start of each purge interval. For example, if the purge time is 2
months, the purge would happen at the beginning of each month. If not
set (default), then event records are never purged.
433
PurgeJobAfter
Individual job records over this age are purged from the database.
Aggregated information is preserved for the period set by
"PurgeUsageAfter". The time is a numeric value and is a number of
months. To purge more often, append "hours" or "days" to the numeric
value (e.g. a value of "12hours" would purge everything older than 12
hours). The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at
the beginning of each month. If not set (default), then job records
are never purged.
445
PurgeResvAfter
Individual reservation records over this age are purged from the
database. Aggregated information is preserved for the period set by
"PurgeUsageAfter". The time is a numeric value and is a number of
months. To purge more often, append "hours" or "days" to the numeric
value (e.g. a value of "12hours" would purge everything older than 12
hours). The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at
the beginning of each month. If not set (default), then reservation
records are never purged.
457
PurgeStepAfter
Individual job step records over this age are purged from the
database. Aggregated information is preserved for the period set by
"PurgeUsageAfter". The time is a numeric value and is a number of
months. To purge more often, append "hours" or "days" to the numeric
value (e.g. a value of "12hours" would purge everything older than 12
hours). The purge takes place at the start of each purge interval.
For example, if the purge time is 2 months, the purge would happen at
the beginning of each month. If not set (default), then job step
records are never purged.
469
PurgeSuspendAfter
Records of individual suspend times for jobs over this age are purged
from the database. Aggregated information is preserved for the period
set by "PurgeUsageAfter". The time is a numeric value and is a number
of months. To purge more often, append "hours" or "days" to the
numeric value (e.g. a value of "12hours" would purge everything older
than 12 hours). The purge takes place at the start of each purge
interval. For example, if the purge time is 2 months, the purge would
happen at the beginning of each month. If not set (default), then
suspend records are never purged.
482
PurgeTXNAfter
Records of individual transaction times for transactions over this
age are purged from the database. The time is a numeric value and is
a number of months. To purge more often, append "hours" or "days" to
the numeric value (e.g. a value of "12hours" would purge everything
older than 12 hours). The purge takes place at the start of each
purge interval. For example, if the purge time is 2 months, the purge
would happen at the beginning of each month. If not set (default),
then transaction records are never purged.
494
PurgeUsageAfter
Usage Records (Cluster, Association and WCKey) over this age are
purged from the database. The time is a numeric value and is a number
of months. To purge more often, append "hours" or "days" to the
numeric value (e.g. a value of "12hours" would purge everything older
than 12 hours). The purge takes place at the start of each purge
interval. For example, if the purge time is 2 months, the purge would
happen at the beginning of each month. If not set (default), then
usage records are never purged.
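
The time units described for the Purge*After options can be sketched
as follows. The retention periods are arbitrary illustrative values,
and the "month" suffix follows the usage in the sample file at the
end of this page.

```conf
# Hour-based purge: drop events older than 12 hours.
PurgeEventAfter=12hours
# Day-based purge: drop job step records older than 30 days.
PurgeStepAfter=30days
# Month-based purge with an explicit suffix.
PurgeJobAfter=12month
# A bare number is read as a number of months, per the text above.
PurgeUsageAfter=24
```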
506
SlurmUser
The name of the user that the slurmdbd daemon executes as. This user
should match the SlurmUser used for all instances of slurmctld that
report to slurmdbd. It must exist on the machine executing the Slurm
Database Daemon and have the same UID there as on the hosts on which
slurmctld executes. For security purposes, a user other than "root"
is recommended. The default value is "root".
515
516 NOTE: If the SlurmUser defined for slurmctld is not root and is
517 different than the SlurmUser defined for slurmdbd, the user used
518 for slurmctld must be added to accounting with AdminLevel=Admin
519 and slurmctld must be restarted.
520
StorageHost
Define the name of the host running the database in which the data
will be stored. Ideally this should be the host on which slurmdbd
executes.
525
StorageBackupHost
Define the name of the backup host running the database in which the
data will be stored. This can be viewed as a backup solution when the
StorageHost is not responding. It is up to the backup solution to
enforce the coherency of the accounting information between the two
hosts. With clustered database solutions (active/passive HA), you
would not need to use this feature. Default is none.
534
535 StorageLoc
536 Specify the name of the database as the location where account‐
537 ing records are written. Defaults to "slurm_acct_db".
538
539 StorageParameters
540 Comma separated list of key-value pair parameters. Currently
541 supported values include options to establish a secure connec‐
542 tion to the database:
543
544 SSL_CERT
545 The path name of the client public key certificate file.
546
547 SSL_CA
548 The path name of the Certificate Authority (CA) certificate
549 file.
550
551 SSL_CAPATH
552 The path name of the directory that contains trusted SSL CA
553 certificate files.
554
555 SSL_KEY
556 The path name of the client private key file.
557
558 SSL_CIPHER
559 The list of permissible ciphers for SSL encryption.
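
A sketch of a secure database connection. The KEY=value form inside
the comma separated list is assumed from the "key-value pair" wording
above, and all file paths are hypothetical.

```conf
# Client certificate, client private key, and CA certificate.
StorageParameters=SSL_CERT=/etc/slurm/ssl/client-cert.pem,SSL_KEY=/etc/slurm/ssl/client-key.pem,SSL_CA=/etc/slurm/ssl/ca-cert.pem
```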
560
561 StoragePass
562 Define the password used to gain access to the database to store
563 the job accounting data. The '#' character is not permitted in a
564 password.
565
StoragePort
The port number that the Slurm Database Daemon (slurmdbd) uses to
communicate with the database. Default is 3306.
569
570 StorageType
571 Define the accounting storage mechanism type. Acceptable values
572 at present include "accounting_storage/mysql". The value "ac‐
573 counting_storage/mysql" indicates that accounting records should
574 be written to a MySQL or MariaDB database specified by the Stor‐
575 ageLoc parameter. This value must be specified.
576
StorageUser
Define the name of the user used to connect to the database to store
the job accounting data.
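
Taken together, the Storage options might look like this sketch. The
user name and password are the placeholders from the sample file at
the end of this page, and "localhost" assumes the database runs on
the same host as slurmdbd.

```conf
# Required; currently the only accepted value.
StorageType=accounting_storage/mysql
# Assumes the database is local to the slurmdbd host.
StorageHost=localhost
# Default port.
StoragePort=3306
# Default database name.
StorageLoc=slurm_acct_db
StorageUser=database_mgr
# The '#' character is not permitted in the password.
StoragePass=password_to_database
```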
580
581 TCPTimeout
582 Time permitted for TCP connection to be established. Default
583 value is 2 seconds.
584
585 TrackSlurmctldDown
586 Boolean yes or no. If set the slurmdbd will mark all idle re‐
587 sources on the cluster as down when a slurmctld disconnects or
588 is no longer reachable. The default is no.
589
TrackWCKey
Boolean yes or no. Used to enable display and tracking of the
Workload Characterization Key. Must be set to track wckey usage and
to generate rolled up usage tables from WCKeys.
NOTE: If TrackWCKey is set here and not in your various slurm.conf
files, all jobs will be attributed to their default WCKey.
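
A sketch of the setting described in the note above. It assumes that
the slurm.conf file of each cluster carries the same line.

```conf
# In slurmdbd.conf: track and roll up WCKey usage.
TrackWCKey=yes
# The same setting is expected in each cluster's slurm.conf so that
# jobs are not all attributed to their default WCKey.
```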
597
#
# Sample /etc/slurmdbd.conf
#
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveResvs=yes
ArchiveSteps=no
ArchiveSuspend=no
ArchiveTXN=no
ArchiveUsage=no
#ArchiveScript=/usr/sbin/slurm.dbd.archive
AuthInfo=/var/run/munge/munge.socket.2
AuthType=auth/munge
DbdHost=db_host
DebugLevel=info
PurgeEventAfter=1month
PurgeJobAfter=12month
PurgeResvAfter=1month
PurgeStepAfter=1month
PurgeSuspendAfter=1month
PurgeTXNAfter=12month
PurgeUsageAfter=24month
LogFile=/var/log/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
SlurmUser=slurm_mgr
StoragePass=password_to_database
StorageType=accounting_storage/mysql
StorageUser=database_mgr
627
628
630 Copyright (C) 2008-2010 Lawrence Livermore National Security. Produced
631 at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
632 Copyright (C) 2010-2022 SchedMD LLC.
633
634 This file is part of Slurm, a resource management program. For de‐
635 tails, see <https://slurm.schedmd.com/>.
636
637 Slurm is free software; you can redistribute it and/or modify it under
638 the terms of the GNU General Public License as published by the Free
639 Software Foundation; either version 2 of the License, or (at your op‐
640 tion) any later version.
641
642 Slurm is distributed in the hope that it will be useful, but WITHOUT
643 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
644 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
645 for more details.
646
647
649 /etc/slurmdbd.conf
650
651
slurm.conf(5), slurmctld(8), slurmdbd(8), syslog(2)

April 2022 Slurm Configuration File slurmdbd.conf(5)