slurmdbd.conf(5) Slurm Configuration File slurmdbd.conf(5)
2
3
4
NAME
slurmdbd.conf - Slurm Database Daemon (SlurmDBD) configuration file
7
8
DESCRIPTION
slurmdbd.conf is an ASCII file which describes Slurm Database Daemon
(SlurmDBD) configuration information. The file location can be modified
at system build time using the DEFAULT_SLURM_CONF parameter or at
execution time by setting the SLURM_CONF environment variable.
14
The contents of the file are case insensitive except for the names of
nodes and files. Any text following a "#" in the configuration file is
treated as a comment through the end of that line. Changes to the
configuration file take effect upon restart of SlurmDBD or daemon receipt
of the SIGHUP signal unless otherwise noted.
20
This file should exist only on the computer where SlurmDBD executes and
should be readable only by the user who executes SlurmDBD (e.g.
"slurm"). If the slurmdbd daemon is started as user root and changes
to another user ID, the configuration file will initially be read as
user root, but will be read as the other user ID in response to a
SIGHUP signal. This file should be protected from unauthorized access
since it contains a database password. The overall configuration
parameters available include:
29
30
31 ArchiveDir
If ArchiveScript is not set, the slurmdbd will generate a file
that can be read in at any time with sacctmgr load filename. This
directory is where the file will be placed after a purge event
has happened and archive for that element is set to true.
Default is /tmp. The format for this file's name is
$ArchiveDir/$ClusterName_$ArchiveObject_archive_$BeginTimeStamp_$endTimeStamp.
Archive files are limited to 50000 records per file. If more than
50000 records exist during that time period, they will be written
to a new file. Subsequent archive files during the same time period
will have ".<number>" appended to the file name, for example .2,
with the number increasing by one for each file in the same time
period.
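
As an illustration of the archive options, the fragment below is a
minimal sketch of a configuration that archives job records into a
dedicated directory when they are purged. The directory path and the
12-month retention are assumptions chosen for the example, not
recommended values.

```conf
# Hypothetical archive directory; slurmdbd needs write access to it.
ArchiveDir=/var/spool/slurm/archive
# Archive job records when they are purged.
ArchiveJobs=yes
# Purge (and therefore archive) job records older than 12 months.
PurgeJobAfter=12
```

With these settings, archive files would be written under
/var/spool/slurm/archive using the name format described above.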
44
45
46 ArchiveEvents
47 When purging events also archive them. Boolean, yes to archive
48 event data, no otherwise. Default is no.
49
50
51 ArchiveJobs
52 When purging jobs also archive them. Boolean, yes to archive
53 job data, no otherwise. Default is no.
54
55
56 ArchiveResvs
57 When purging reservations also archive them. Boolean, yes to
58 archive reservation data, no otherwise. Default is no.
59
60
61 ArchiveScript
This script can be executed every time a rollup happens (every
hour, day and month), depending on the Purge*After options.
This script is used to transfer accounting records out of the
database into an archive. It is used in place of the internal
process used to archive objects. The script is executed with
no arguments. The following environment variables are set:
68
SLURM_ARCHIVE_EVENTS
1 to archive events, 0 otherwise.

SLURM_ARCHIVE_LAST_EVENT
Time of last event start to archive.

SLURM_ARCHIVE_JOBS
1 to archive jobs, 0 otherwise.

SLURM_ARCHIVE_LAST_JOB
Time of last job submit to archive.

SLURM_ARCHIVE_STEPS
1 to archive steps, 0 otherwise.

SLURM_ARCHIVE_LAST_STEP
Time of last step start to archive.

SLURM_ARCHIVE_SUSPEND
1 to archive suspend data, 0 otherwise.

SLURM_ARCHIVE_TXN
1 to archive transaction data, 0 otherwise.

SLURM_ARCHIVE_USAGE
1 to archive usage data, 0 otherwise.

SLURM_ARCHIVE_LAST_SUSPEND
Time of last suspend start to archive.
98
99
100
101 ArchiveSteps
102 When purging steps also archive them. Boolean, yes to archive
103 step data, no otherwise. Default is no.
104
105
106 ArchiveSuspend
107 When purging suspend data also archive it. Boolean, yes to ar‐
108 chive suspend data, no otherwise. Default is no.
109
110
111 ArchiveTXN
112 When purging transaction data also archive it. Boolean, yes to
113 archive transaction data, no otherwise. Default is no.
114
115
116 ArchiveUsage
When purging usage data (Cluster, Association and WCKey) also
archive it. Boolean, yes to archive usage data, no otherwise.
Default is no.
120
121
122 AuthInfo
123 Additional information to be used for authentication of communi‐
124 cations with the Slurm control daemon (slurmctld) on each clus‐
125 ter. The interpretation of this option is specific to the con‐
126 figured AuthType. In the case of auth/munge, this can be con‐
127 figured to use a Munge daemon specifically configured to provide
128 authentication between clusters while the default Munge daemon
129 provides authentication within a cluster. In that case, this
130 will specify the pathname of the socket to use. Per default this
131 value is left unspecified, which results in the default authen‐
132 tication mechanism being used.
133
134
135 AuthAltTypes
Comma separated list of alternative authentication plugins
that the slurmdbd will permit for communication.
138
139
140 AuthAltParameters
Used to define options for the alternative authentication plugins.
Multiple options may be comma separated.

jwt_key= Absolute path to JWT key file. Key must be HS256,
and should only be accessible by SlurmUser.
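
The two AuthAlt options are normally set together. The sketch below
assumes the alternative plugin is "auth/jwt", which this page does not
name explicitly but which is the plugin that uses a jwt_key; the key
path is hypothetical.

```conf
# Assumed plugin name for JWT authentication.
AuthAltTypes=auth/jwt
# Hypothetical key location; the file should be accessible only by SlurmUser.
AuthAltParameters=jwt_key=/etc/slurm/jwt_hs256.key
```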
146
147
148 AuthType
149 Define the authentication method for communications between
150 Slurm components. Acceptable values at present include
151 "auth/none" and "auth/munge". The default value is
152 "auth/munge". Do not use "auth/none" if you desire any secu‐
153 rity. "auth/munge" indicates that LLNL's MUNGE system is to be
154 used (this is the supported authentication mechanism for Slurm;
155 see "https://dun.github.io/munge/" for more information). Slur‐
156 mDBD must be terminated prior to changing the value of AuthType
157 and later restarted.
158
159
160 CommitDelay
How many seconds between commits on a connection from a slurmctld.
Delaying commits speeds up inserts into the database dramatically.
If you are running a very high throughput of jobs you should
consider setting this. In testing, a value of 1 second improved
slurmdbd performance dramatically and reduced overhead. The
trade-off is a small window in which data not yet committed could
be lost if the slurmdbd segfaults or exits abnormally for any
reason. That risk is low, and setting this option may be the only
way to run in extremely heavy environments.
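
A minimal sketch of the setting discussed above, using the 1 second
value mentioned in the testing note. Whether this value suits a given
site is an assumption to verify against its own job throughput.

```conf
# Commit to the database at most once per second per slurmctld connection.
CommitDelay=1
```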
173
174
175 CommunicationParameters
176 Comma separated options identifying communication options.
177
178 DisableIPv4 Disable IPv4 only operation for the slurmdbd.
179 This should also be set in your slurm.conf file.
180
181 EnableIPv6 Enable using IPv6 addresses for the slurmdbd.
182 When using both IPv4 and IPv6, address family
183 preferences will be based on your /etc/gai.conf
184 file. This should also be set in your slurm.conf
185 file.
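
For example, a site that wants the slurmdbd to use IPv6 addresses
alongside IPv4 might use the following sketch. As noted above, the same
option would also need to be set in slurm.conf.

```conf
# Enable IPv6; address family preference then follows /etc/gai.conf.
CommunicationParameters=EnableIPv6
```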
186
187
188 DbdBackupHost
189 The short, or long, name of the machine where the backup Slurm
190 Database Daemon is executed (i.e. the name returned by the com‐
191 mand "hostname -s"). This host must have access to the same
192 underlying database specified by the 'Storage' options mentioned
193 below.
194
195
196 DbdAddr
197 Name that DbdHost should be referred to in establishing a commu‐
198 nications path. This name will be used as an argument to the
199 getaddrinfo() function for identification. For example,
200 "elx0000" might be used to designate the Ethernet address for
201 node "lx0000". By default the DbdAddr will be identical in
202 value to DbdHost.
203
204
205 DbdHost
206 The short, or long, name of the machine where the Slurm Database
207 Daemon is executed (i.e. the name returned by the command "host‐
208 name -s"). This value must be specified.
209
210
211 DbdPort
The port number that the Slurm Database Daemon (slurmdbd) listens
on for work. The default value is SLURMDBD_PORT as established
at system build time. If no value is explicitly specified, it
will be set to 6819. This value must be equal to the
AccountingStoragePort parameter in the slurm.conf file.
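
The relationship between the Dbd* options and slurm.conf can be
sketched as follows. The host names reuse the "lx0000" and "elx0000"
example from DbdAddr and are purely illustrative.

```conf
# slurmdbd.conf (host names are illustrative)
DbdHost=lx0000
DbdAddr=elx0000
DbdPort=6819

# slurm.conf on each cluster: the port must match DbdPort above
AccountingStoragePort=6819
```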
217
218
219 DebugFlags
220 Defines specific subsystems which should provide more detailed
221 event logging. Multiple subsystems can be specified with comma
222 separators. Most DebugFlags will result in verbose logging for
223 the identified subsystems and could impact performance. Valid
224 subsystems available today (with more to come) include:
225
226 DB_ARCHIVE
227 SQL statements/queries when dealing with archiving and
228 purging the database.
229
230 DB_ASSOC
231 SQL statements/queries when dealing with associations in
232 the database.
233
234 DB_EVENT
235 SQL statements/queries when dealing with (node) events in
236 the database.
237
238 DB_JOB
239 SQL statements/queries when dealing with jobs in the
240 database.
241
242 DB_QOS
243 SQL statements/queries when dealing with QOS in the data‐
244 base.
245
246 DB_QUERY
247 SQL statements/queries when dealing with transactions and
248 such in the database.
249
250 DB_RESERVATION
251 SQL statements/queries when dealing with reservations in
252 the database.
253
254 DB_RESOURCE
255 SQL statements/queries when dealing with resources like
256 licenses in the database.
257
258 DB_STEP
259 SQL statements/queries when dealing with steps in the
260 database.
261
262 DB_TRES
263 SQL statements/queries when dealing with trackable
264 resources in the database.
265
266 DB_USAGE
267 SQL statements/queries when dealing with usage queries
268 and inserts in the database.
269
270 DB_WCKEY
271 SQL statements/queries when dealing with wckeys in the
272 database.
273
274 FEDERATION
275 SQL statements/queries when dealing with federations in
276 the database.
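
As a sketch of the comma-separated syntax, the following line would
enable SQL statement logging for job and step handling only. The choice
of subsystems is an assumption for illustration.

```conf
# Log SQL statements for jobs and steps; expect more verbose logs.
DebugFlags=DB_JOB,DB_STEP
```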
277
278
279 DebugLevel
280 The level of detail to provide the Slurm Database Daemon's logs.
281 The default value is info.
282
283 quiet Log nothing
284
285 fatal Log only fatal errors
286
287 error Log only errors
288
289 info Log errors and general informational messages
290
291 verbose Log errors and verbose informational messages
292
293 debug Log errors and verbose informational messages and
294 debugging messages
295
296 debug2 Log errors and verbose informational messages and more
297 debugging messages
298
299 debug3 Log errors and verbose informational messages and even
300 more debugging messages
301
302 debug4 Log errors and verbose informational messages and even
303 more debugging messages
304
305 debug5 Log errors and verbose informational messages and even
306 more debugging messages
307
308
309 DebugLevelSyslog
The slurmdbd daemon will log events to the syslog file at the
specified level of detail. If not set, the level depends on how
the daemon is run: if it runs in the foreground, syslog logging is
set to quiet; if it runs in the background with no LogFile, it
logs to syslog at the level specified by DebugLevel (or at fatal
if DebugLevel is set to quiet); otherwise it logs to syslog at
level fatal.
317
318
319 quiet Log nothing
320
321 fatal Log only fatal errors
322
323 error Log only errors
324
325 info Log errors and general informational messages
326
327 verbose Log errors and verbose informational messages
328
329 debug Log errors and verbose informational messages and
330 debugging messages
331
332 debug2 Log errors and verbose informational messages and more
333 debugging messages
334
335 debug3 Log errors and verbose informational messages and even
336 more debugging messages
337
338 debug4 Log errors and verbose informational messages and even
339 more debugging messages
340
341 debug5 Log errors and verbose informational messages and even
342 more debugging messages
343
344
345
346 DefaultQOS
When adding a new cluster this will be used as the QOS for the
cluster unless the admin explicitly sets one when creating the
cluster.
350
351
352 LogFile
353 Fully qualified pathname of a file into which the Slurm Database
354 Daemon's logs are written. The default value is none (performs
355 logging via syslog).
356 See the section LOGGING in the slurm.conf man page if a pathname
357 is specified.
358
359
360 LogTimeFormat
Format of the timestamp in slurmdbd log files. Accepted values
are "iso8601", "iso8601_ms", "rfc5424", "rfc5424_ms", "clock",
"short" and "thread_id". The values ending in "_ms" differ from
the ones without in that fractional seconds with millisecond
precision are printed. The default value is "iso8601_ms". The
"rfc5424" formats are the same as the "iso8601" formats except
that the timezone value is also shown. The "clock" format shows a
timestamp in microseconds retrieved with the C standard clock()
function. The "short" format is a short date and time format.
The "thread_id" format shows the timestamp in the C standard
ctime() function form without the year but including the
microseconds, the daemon's process ID and the current thread ID.
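
The logging options described above (DebugLevel, DebugLevelSyslog,
LogFile and LogTimeFormat) are often set as a group. The fragment below
is one possible combination, given as a sketch; the levels chosen are
assumptions, and the log path matches the sample file at the end of
this page.

```conf
# Write detailed logs to a file, but send only errors to syslog.
LogFile=/var/log/slurmdbd.log
DebugLevel=verbose
DebugLevelSyslog=error
# Timestamps with milliseconds and timezone.
LogTimeFormat=rfc5424_ms
```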
373
374
375 MaxQueryTimeRange
Return an error if a query is against too large a time span,
to prevent ill-formed queries from causing performance problems
within SlurmDBD. Default value is INFINITE which allows any
queries to proceed. Accepted time formats are the same as the
MaxTime option in slurm.conf. User SlurmUser and root are
exempt from this restriction. Note that queries which attempt
to return over 3GB of data will still fail to complete with
ESLURM_RESULT_TOO_LARGE.
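
A sketch of a 31-day limit, assuming the "days-hours" form that
slurm.conf documents for MaxTime. The limit itself is an arbitrary
example value.

```conf
# Reject queries from regular users that span more than 31 days.
MaxQueryTimeRange=31-0
```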
384
385
386 MessageTimeout
387 Time permitted for a round-trip communication to complete in
388 seconds. Default value is 10 seconds.
389
390
391 Parameters
392 Contains arbitrary comma separated parameters used to alter the
393 behavior of the slurmdbd.
394
395 PreserveCaseUser
When defining users, do not force user names to lower case.
Forcing lower case is the default behavior.
398
399
400 PidFile
401 Fully qualified pathname of a file into which the Slurm Database
402 Daemon may write its process ID. This may be used for automated
403 signal processing. The default value is "/var/run/slur‐
404 mdbd.pid".
405
406
407 PluginDir
408 Identifies the places in which to look for Slurm plugins. This
409 is a colon-separated list of directories, like the PATH environ‐
410 ment variable. The default value is the prefix given at config‐
411 ure time + "/lib/slurm".
412
413
414 PrivateData
415 This controls what type of information is hidden from regular
416 users. By default, all information is visible to all users.
417 User SlurmUser, root, and users with AdminLevel=Admin can always
418 view all information. Multiple values may be specified with a
419 comma separator. Acceptable values include:
420
421 accounts
422 prevents users from viewing any account definitions
423 unless they are coordinators of them.
424
425 events prevents users from viewing event information unless they
426 have operator status or above.
427
428 jobs prevents users from viewing job records belonging to
429 other users unless they are coordinators of the account
430 running the job when using sacct.
431
432 reservations
433 restricts getting reservation information to users with
434 operator status and above.
435
usage prevents users from viewing usage of any other user.
This applies to sreport.

users prevents users from viewing information of any user other
than themselves. This also means that users can only see
associations they deal with. Coordinators can see associations
of all users in the account they are coordinator
of, but can only see themselves when listing users.
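
Several of the values above can be combined with commas. The following
sketch hides other users' job records, usage and user information from
regular users; which values a site needs is an assumption to adapt.

```conf
# Regular users see only their own jobs, usage and user records.
PrivateData=jobs,usage,users
```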
444
445
446 PurgeEventAfter
Events happening on the cluster over this age are purged from
the database. This includes node down times and such. The time
is a numeric value and is a number of months. If you want to
purge more often you can append "hours" or "days" to the
numeric value to get more frequent purges (e.g. a value of
"12hours" would purge everything older than 12 hours). The
purge takes place at the start of each purge interval. For
example, if the purge time is 2 months, the purge would happen
at the beginning of each month. If not set (default), then
event records are never purged.
457
458
459 PurgeJobAfter
Individual job records over this age are purged from the
database. Aggregated information will be preserved to
"PurgeUsageAfter". The time is a numeric value and is a number
of months. If you want to purge more often you can append
"hours" or "days" to the numeric value to get more frequent
purges (e.g. a value of "12hours" would purge everything
older than 12 hours). The purge takes place at the start
of each purge interval. For example, if the purge time is 2
months, the purge would happen at the beginning of each month.
If not set (default), then job records are never purged.
470
471
472 PurgeResvAfter
Individual reservation records over this age are purged from the
database. Aggregated information will be preserved to
"PurgeUsageAfter". The time is a numeric value and is a number
of months. If you want to purge more often you can append
"hours" or "days" to the numeric value to get more frequent
purges (e.g. a value of "12hours" would purge everything
older than 12 hours). The purge takes place at the start
of each purge interval. For example, if the purge time is 2
months, the purge would happen at the beginning of each month.
If not set (default), then reservation records are never purged.
483
484
485 PurgeStepAfter
Individual job step records over this age are purged from the
database. Aggregated information will be preserved to
"PurgeUsageAfter". The time is a numeric value and is a number
of months. If you want to purge more often you can append
"hours" or "days" to the numeric value to get more frequent
purges (e.g. a value of "12hours" would purge everything
older than 12 hours). The purge takes place at the start
of each purge interval. For example, if the purge time is 2
months, the purge would happen at the beginning of each month.
If not set (default), then job step records are never purged.
496
497
498 PurgeSuspendAfter
Records of individual suspend times for jobs over this age are
purged from the database. Aggregated information will be
preserved to "PurgeUsageAfter". The time is a numeric value and is
a number of months. If you want to purge more often you can
append "hours" or "days" to the numeric value to get more
frequent purges (e.g. a value of "12hours" would purge
everything older than 12 hours). The purge takes place at the
start of each purge interval. For example, if the purge
time is 2 months, the purge would happen at the beginning of
each month. If not set (default), then suspend records are
never purged.
510
511
512 PurgeTXNAfter
Records of individual transaction times for transactions over
this age are purged from the database. The time is a numeric
value and is a number of months. If you want to purge more
often you can append "hours" or "days" to the numeric
value to get more frequent purges (e.g. a value of
"12hours" would purge everything older than 12 hours). The
purge takes place at the start of each purge interval. For
example, if the purge time is 2 months, the purge would happen
at the beginning of each month. If not set (default), then
transaction records are never purged.
523
524
525 PurgeUsageAfter
Usage records (Cluster, Association and WCKey) over this age are
purged from the database. The time is a numeric value and is a
number of months. If you want to purge more often you can
append "hours" or "days" to the numeric value to get more
frequent purges (e.g. a value of "12hours" would purge
everything older than 12 hours). The purge takes place at the
start of each purge interval. For example, if the purge
time is 2 months, the purge would happen at the beginning of
each month. If not set (default), then usage records are never
purged.
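
The unit rules shared by the Purge*After options can be sketched as
follows. The retention periods are arbitrary example values, not
recommendations.

```conf
# No suffix: the value is a number of months.
PurgeJobAfter=12
# A "days" or "hours" suffix purges more often.
PurgeStepAfter=30days
PurgeEventAfter=12hours
# PurgeUsageAfter is left unset here, so usage records are never purged.
```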
536
537
538 SlurmUser
The name of the user that the slurmdbd daemon executes as. This
user must exist on the machine executing the Slurm Database Daemon
and have the same UID as on the hosts on which slurmctld executes.
For security purposes, a user other than "root" is recommended.
The default value is "root". This name should also be
the same SlurmUser on all clusters reporting to the SlurmDBD.
NOTE: If this user is different from the one set for slurmctld
and is not root, it must be added to accounting with
AdminLevel=Admin and slurmctld must be restarted.
548
549
550 StorageHost
Define the name of the host where the database that stores the
data is running. Ideally this should be the host on
which slurmdbd executes.
554
555
556 StorageBackupHost
Define the name of the backup host where the database that stores
the data is running. This can be viewed as a backup
solution when the StorageHost is not responding. It is up to
the backup solution to enforce the coherency of the accounting
information between the two hosts. With clustered database
solutions (active/passive HA), you would not need to use this
feature. Default is none.
564
565
566 StorageLoc
567 Specify the name of the database as the location where account‐
568 ing records are written. Defaults to "slurm_acct_db".
569
570
571 StorageParameters
572 Comma separated list of key-value pair parameters. Currently
573 supported values include options to establish a secure connec‐
574 tion to the database:
575
576 SSL_CERT
577 The path name of the client public key certificate file.
578
579 SSL_CA
580 The path name of the Certificate Authority (CA) certificate
581 file.
582
583 SSL_CAPATH
584 The path name of the directory that contains trusted SSL CA
585 certificate files.
586
587 SSL_KEY
588 The path name of the client private key file.
589
590 SSL_CIPHER
591 The list of permissible ciphers for SSL encryption.
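
Because the values are key-value pairs inside a comma separated list,
a secured connection might be configured as in the sketch below. All
three file paths are hypothetical.

```conf
# Hypothetical certificate and key locations for the database connection.
StorageParameters=SSL_CERT=/etc/slurm/ssl/client-cert.pem,SSL_KEY=/etc/slurm/ssl/client-key.pem,SSL_CA=/etc/slurm/ssl/ca-cert.pem
```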
592
593
594 StoragePass
595 Define the password used to gain access to the database to store
596 the job accounting data. The '#' character is not permitted in a
597 password.
598
599
600 StoragePort
The port number that the Slurm Database Daemon (slurmdbd) uses to
communicate with the database. Default is 3306.
603
604
605 StorageType
606 Define the accounting storage mechanism type. Acceptable values
607 at present include "accounting_storage/mysql". The value
608 "accounting_storage/mysql" indicates that accounting records
609 should be written to a MySQL or MariaDB database specified by
610 the StorageLoc parameter. This value must be specified.
611
612
613 StorageUser
Define the name of the user used to connect to the database
to store the job accounting data.
616
617
618 TCPTimeout
619 Time permitted for TCP connection to be established. Default
620 value is 2 seconds.
621
622
623 TrackSlurmctldDown
624 Boolean yes or no. If set the slurmdbd will mark all idle
625 resources on the cluster as down when a slurmctld disconnects or
626 is no longer reachable. The default is no.
627
628
629 TrackWCKey
Boolean yes or no. Enables display and tracking of the Workload
Characterization Key. Must be set to track wckey usage.
This must be set to generate rolled up usage tables from WCKeys.
NOTE: If TrackWCKey is set here and not in your various
slurm.conf files all jobs will be attributed to their default
WCKey.
636
637
EXAMPLE
#
# Sample /etc/slurmdbd.conf
#
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveResvs=yes
ArchiveSteps=no
ArchiveSuspend=no
ArchiveTXN=no
ArchiveUsage=no
#ArchiveScript=/usr/sbin/slurm.dbd.archive
AuthInfo=/var/run/munge/munge.socket.2
AuthType=auth/munge
DbdHost=db_host
DebugLevel=info
PurgeEventAfter=1month
PurgeJobAfter=12month
PurgeResvAfter=1month
PurgeStepAfter=1month
PurgeSuspendAfter=1month
PurgeTXNAfter=12month
PurgeUsageAfter=24month
LogFile=/var/log/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
SlurmUser=slurm_mgr
StoragePass=password_to_database
StorageType=accounting_storage/mysql
StorageUser=database_mgr
667
668
COPYING
Copyright (C) 2008-2010 Lawrence Livermore National Security. Produced
at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2014 SchedMD LLC.
673
674 This file is part of Slurm, a resource management program. For
675 details, see <https://slurm.schedmd.com/>.
676
677 Slurm is free software; you can redistribute it and/or modify it under
678 the terms of the GNU General Public License as published by the Free
679 Software Foundation; either version 2 of the License, or (at your
680 option) any later version.
681
682 Slurm is distributed in the hope that it will be useful, but WITHOUT
683 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
684 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
685 for more details.
686
687
FILES
/etc/slurmdbd.conf
690
691
SEE ALSO
slurm.conf(5), slurmctld(8), slurmdbd(8), syslog(2)
694
695
696
November 2020 Slurm Configuration File slurmdbd.conf(5)