slurmdbd.conf(5)           Slurm Configuration File           slurmdbd.conf(5)

NAME

6       slurmdbd.conf - Slurm Database Daemon (SlurmDBD) configuration file
7
8

DESCRIPTION

       slurmdbd.conf is an ASCII file which describes Slurm Database Daemon
       (SlurmDBD) configuration information.  The file location can be
       modified at system build time using the DEFAULT_SLURM_CONF parameter
       or at execution time by setting the SLURM_CONF environment variable.

       The contents of the file are case insensitive except for the names of
       nodes and files.  Any text following a "#" in the configuration file
       is treated as a comment through the end of that line.  Changes to the
       configuration file take effect upon restart of SlurmDBD or daemon
       receipt of the SIGHUP signal unless otherwise noted.

       This file should exist only on the computer where SlurmDBD executes
       and should be readable only by the user who executes SlurmDBD (e.g.
       "slurm").  If the slurmdbd daemon is started as user root and changes
       to another user ID, the configuration file will initially be read as
       user root, but will be read as the other user ID in response to a
       SIGHUP signal.  This file should be protected from unauthorized access
       since it contains a database password.  The overall configuration
       parameters available include:
29
30
31       ArchiveDir
              If ArchiveScript is not set, slurmdbd will generate a file
              that can be read in at any time with sacctmgr load filename.
              This directory is where the file will be placed after a purge
              event has happened and archiving for that element is set to
              true.  Default is /tmp.  The format for this file's name is
              $ArchiveDir/$ClusterName_$ArchiveObject_archive_$BeginTimeStamp_$endTimeStamp
39
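              As an illustrative sketch (the directory path below is an
              assumption, not a Slurm default), archived job records could be
              directed to a dedicated directory instead of /tmp:

                     # Hypothetical location; it should be writable by the
                     # user running slurmdbd
                     ArchiveDir=/var/spool/slurm/archive
                     ArchiveJobs=yes
                     PurgeJobAfter=12month

              With settings like these, each job purge would be expected to
              leave a file in that directory named according to the format
              described above.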
40
41       ArchiveEvents
42              When  purging events also archive them.  Boolean, yes to archive
43              event data, no otherwise.  Default is no.
44
45
46       ArchiveJobs
47              When purging jobs also archive them.  Boolean,  yes  to  archive
48              job data, no otherwise.  Default is no.
49
50
51       ArchiveResvs
52              When  purging  reservations  also archive them.  Boolean, yes to
53              archive reservation data, no otherwise.  Default is no.
54
55
56       ArchiveScript
              This script can be executed every time a rollup happens (every
              hour, day and month), depending on the Purge*After options.
              This script is used to transfer accounting records out of the
              database into an archive.  It is used in place of the internal
              process used to archive objects.  The script is executed with
              no arguments.  The following environment variables are set:

              SLURM_ARCHIVE_EVENTS
                     1 to archive events, 0 otherwise.

              SLURM_ARCHIVE_LAST_EVENT
                     Time of last event start to archive.

              SLURM_ARCHIVE_JOBS
                     1 to archive jobs, 0 otherwise.

              SLURM_ARCHIVE_LAST_JOB
                     Time of last job submit to archive.

              SLURM_ARCHIVE_STEPS
                     1 to archive steps, 0 otherwise.

              SLURM_ARCHIVE_LAST_STEP
                     Time of last step start to archive.

              SLURM_ARCHIVE_SUSPEND
                     1 to archive suspend data, 0 otherwise.

              SLURM_ARCHIVE_TXN
                     1 to archive transaction data, 0 otherwise.

              SLURM_ARCHIVE_USAGE
                     1 to archive usage data, 0 otherwise.

              SLURM_ARCHIVE_LAST_SUSPEND
                     Time of last suspend start to archive.
93
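              A minimal configuration sketch, assuming a site-supplied script
              (the path mirrors the commented-out line in the EXAMPLE section
              and is only a placeholder):

                     # The script receives no arguments; it is expected to
                     # read the SLURM_ARCHIVE_* environment variables
                     ArchiveScript=/usr/sbin/slurm.dbd.archive
                     ArchiveEvents=yes
                     ArchiveJobs=yes

              The script would presumably check each SLURM_ARCHIVE_* flag and
              export only the record types whose flag is 1.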
94
95
96       ArchiveSteps
97              When  purging  steps  also archive them.  Boolean,
98              yes to archive step data, no  otherwise.   Default
99              is no.
100
101
102       ArchiveSuspend
103              When  purging suspend data also archive it.  Bool‐
104              ean, yes to archive suspend  data,  no  otherwise.
105              Default is no.
106
107
108       ArchiveTXN
109              When  purging  transaction  data  also archive it.
110              Boolean, yes to archive transaction data, no  oth‐
111              erwise.  Default is no.
112
113
114       ArchiveUsage
              When purging usage data (Cluster, Association and WCKey) also
              archive it.  Boolean, yes to archive usage data, no otherwise.
              Default is no.
118
119
120       AuthInfo
121              Additional  information to be used for authentica‐
122              tion of communications with the Slurm control dae‐
123              mon  (slurmctld) on each cluster.  The interpreta‐
124              tion of this option is specific to the  configured
125              AuthType.   In the case of auth/munge, this can be
126              configured to use a Munge daemon specifically con‐
127              figured to provide authentication between clusters
128              while the default Munge daemon provides  authenti‐
129              cation  within a cluster.  In that case, this will
              specify the pathname of the socket to use.  By default this
              value is left unspecified, which results in the default
              authentication mechanism being used.
134
135
136       AuthType
137              Define  the  authentication  method for communica‐
138              tions between Slurm components.  Acceptable values
139              at  present  include "auth/none" and "auth/munge".
140              The default value is  "auth/munge".   Do  not  use
141              "auth/none"    if   you   desire   any   security.
142              "auth/munge" indicates that LLNL's MUNGE system is
143              to  be  used (this is the supported authentication
144              mechanism         for          Slurm;          see
145              "https://dun.github.io/munge/"  for  more informa‐
146              tion).   SlurmDBD  must  be  terminated  prior  to
147              changing   the   value   of   AuthType  and  later
148              restarted.
149
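              For illustration, the AuthInfo and AuthType options might be
              combined as follows.  The socket path is the one used in the
              EXAMPLE section; the actual path depends on how the dedicated
              Munge daemon was started at your site:

                     AuthType=auth/munge
                     # Socket of a Munge daemon dedicated to inter-cluster
                     # authentication (site-specific path)
                     AuthInfo=/var/run/munge/munge.socket.2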
150
151       CommitDelay
              How many seconds between commits on a connection from a
              slurmctld.  This speeds up inserts into the database
              dramatically.  If you are running a very high throughput of
              jobs you should consider setting this.  In testing, a value of
              1 second improved slurmdbd performance dramatically and
              reduced overhead.  The trade-off is a small probability of data
              loss: if slurmdbd segfaults or exits abnormally for any reason
              during the delay window, data not yet committed could be lost.
              This should be very rare, but the delay may be the only way to
              run in extremely heavy environments.
167
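              A sketch of the setting discussed above, using the 1 second
              value mentioned in the testing note:

                     # Commit once per second per slurmctld connection;
                     # accepts a small window of possible data loss
                     CommitDelay=1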
168
169       DbdBackupHost
170              The short, or long, name of the machine where  the
171              backup Slurm Database Daemon is executed (i.e. the
172              name returned by the command "hostname -s").  This
173              host must have access to the same underlying data‐
174              base specified by the 'Storage' options  mentioned
175              below.
176
177
178       DbdAddr
179              Name  that DbdHost should be referred to in estab‐
180              lishing a communications path. This name  will  be
181              used  as  an argument to the gethostbyname() func‐
182              tion for identification.  For  example,  "elx0000"
183              might  be  used  to designate the Ethernet address
184              for node "lx0000".  By default the DbdAddr will be
185              identical in value to DbdHost.
186
187
188       DbdHost
189              The  short, or long, name of the machine where the
190              Slurm Database Daemon is executed (i.e.  the  name
191              returned  by  the  command  "hostname  -s").  This
192              value must be specified.
193
194
195       DbdPort
196              The port number that  the  Slurm  Database  Daemon
197              (slurmdbd)  listens to for work. The default value
198              is SLURMDBD_PORT as established  at  system  build
199              time.  If none is explicitly specified, it will be
200              set to 6819.  This value  must  be  equal  to  the
201              AccountingStoragePort  parameter in the slurm.conf
202              file.
203
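              The following sketch shows how the Dbd* options might fit
              together.  The host names are hypothetical; 6819 is the port
              named above:

                     # slurmdbd.conf
                     DbdHost=dbd1
                     DbdAddr=dbd1-mgmt
                     DbdBackupHost=dbd2
                     DbdPort=6819

                     # slurm.conf on each cluster must use the same port
                     AccountingStoragePort=6819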
204
205       DebugFlags
206              Defines specific subsystems which  should  provide
207              more  detailed event logging.  Multiple subsystems
208              can be  specified  with  comma  separators.   Most
209              DebugFlags  will result in verbose logging for the
210              identified subsystems  and  could  impact  perfor‐
211              mance.   Valid  subsystems  available  today (with
212              more to come) include:
213
214              DB_ARCHIVE       SQL statements/queries when deal‐
215                               ing  with  archiving  and purging
216                               the database.
217
218              DB_ASSOC         SQL statements/queries when deal‐
219                               ing   with  associations  in  the
220                               database.
221
222              DB_EVENT         SQL statements/queries when deal‐
223                               ing  with  (node)  events  in the
224                               database.
225
226              DB_JOB           SQL statements/queries when deal‐
227                               ing with jobs in the database.
228
229              DB_QOS           SQL statements/queries when deal‐
230                               ing with QOS in the database.
231
232              DB_QUERY         SQL statements/queries when deal‐
233                               ing with transactions and such in
234                               the database.
235
236              DB_RESERVATION   SQL statements/queries when deal‐
237                               ing   with  reservations  in  the
238                               database.
239
240              DB_RESOURCE      SQL statements/queries when deal‐
241                               ing  with resources like licenses
242                               in the database.
243
244              DB_STEP          SQL statements/queries when deal‐
245                               ing with steps in the database.
246
247              DB_USAGE         SQL statements/queries when deal‐
248                               ing  with   usage   queries   and
249                               inserts in the database.
250
251              DB_WCKEY         SQL statements/queries when deal‐
252                               ing with wckeys in the database.
253
254              FEDERATION       SQL statements/queries when deal‐
255                               ing with federations in the data‐
256                               base.
257
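              For example, to trace the SQL issued while purging and
              archiving as well as job-related statements, two of the flags
              listed above could be combined (expect more verbose logs and a
              possible performance impact):

                     DebugFlags=DB_ARCHIVE,DB_JOB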
258
259       DebugLevel
260              The level of detail to provide the Slurm  Database
261              Daemon's logs.  The default value is info.
262
263              quiet     Log nothing
264
265              fatal     Log only fatal errors
266
267              error     Log only errors
268
269              info      Log  errors  and  general  informational
270                        messages
271
272              verbose   Log  errors  and  verbose  informational
273                        messages
274
275              debug     Log  errors  and  verbose  informational
276                        messages and debugging messages
277
278              debug2    Log  errors  and  verbose  informational
279                        messages and more debugging messages
280
281              debug3    Log  errors  and  verbose  informational
282                        messages and even  more  debugging  mes‐
283                        sages
284
285              debug4    Log  errors  and  verbose  informational
286                        messages and even  more  debugging  mes‐
287                        sages
288
289              debug5    Log  errors  and  verbose  informational
290                        messages and even  more  debugging  mes‐
291                        sages
292
293
294       DebugLevelSyslog
              The slurmdbd daemon will log events to the syslog file at the
              specified level of detail.  If not set, the slurmdbd daemon
              will log to syslog at level fatal, with two exceptions: if
              there is no LogFile and it is running in the background, it
              will log to syslog at the level specified by DebugLevel (at
              fatal if DebugLevel is set to quiet); if it is run in the
              foreground, the syslog level will be set to quiet.
304
305
306              quiet     Log nothing
307
308              fatal     Log only fatal errors
309
310              error     Log only errors
311
312              info      Log  errors  and  general  informational
313                        messages
314
315              verbose   Log  errors  and  verbose  informational
316                        messages
317
318              debug     Log  errors  and  verbose  informational
319                        messages and debugging messages
320
321              debug2    Log  errors  and  verbose  informational
322                        messages and more debugging messages
323
324              debug3    Log  errors  and  verbose  informational
325                        messages and even  more  debugging  mes‐
326                        sages
327
328              debug4    Log  errors  and  verbose  informational
329                        messages and even  more  debugging  mes‐
330                        sages
331
332              debug5    Log  errors  and  verbose  informational
333                        messages and even  more  debugging  mes‐
334                        sages
335
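              As a sketch of how the logging options interact, the following
              would send informational messages to a log file while limiting
              syslog to errors (the log path is the one used in the EXAMPLE
              section):

                     DebugLevel=info
                     DebugLevelSyslog=error
                     LogFile=/var/log/slurmdbd.log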
336
337
338       DefaultQOS
              When adding a new cluster, this will be used as the QOS for the
              cluster unless the admin explicitly sets one when creating it.
342
343
344       LogFile
345              Fully  qualified pathname of a file into which the
346              Slurm Database Daemon's  logs  are  written.   The
347              default  value  is none (performs logging via sys‐
348              log).
349              See the section LOGGING in the slurm.conf man page
350              if a pathname is specified.
351
352
353       LogTimeFormat
              Format of the timestamp in slurmdbd log files.  Accepted values
              are "iso8601", "iso8601_ms", "rfc5424", "rfc5424_ms", "clock",
              "short" and "thread_id".  The values ending in "_ms" differ
              from the ones without in that fractional seconds with
              millisecond precision are printed.  The default value is
              "iso8601_ms".  The "rfc5424" formats are the same as the
              "iso8601" formats except that the timezone value is also shown.
              The "clock" format shows a timestamp in microseconds retrieved
              with the C standard clock() function.  The "short" format is a
              short date and time format.  The "thread_id" format shows the
              timestamp in the C standard ctime() function form without the
              year but including the microseconds, the daemon's process ID
              and the current thread ID.
370
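              For instance, a site that wants the timezone included along
              with millisecond precision could select one of the values
              described above:

                     LogTimeFormat=rfc5424_ms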
371
372       MaxQueryTimeRange
373              Return an error if a query is against too large of
374              a  time  span,  to prevent ill-formed queries from
375              causing  performance  problems  within   SlurmDBD.
376              Default value is INFINITE which allows any queries
377              to proceed.  Accepted time formats are the same as
378              the  MaxTime option in slurm.conf.  User SlurmUser
379              and root are exempt from this  restriction.   Note
380              that  queries  which attempt to return over 3GB of
381              data   will   still   fail   to   complete    with
382              ESLURM_RESULT_TOO_LARGE.
383
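              A sketch of limiting queries to roughly two months, assuming
              the "days-hours" form is among the time formats accepted by
              MaxTime in slurm.conf (check slurm.conf(5) for the exact list):

                     # 60 days, 0 hours; SlurmUser and root remain exempt
                     MaxQueryTimeRange=60-0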
384
385       MessageTimeout
386              Time  permitted  for a round-trip communication to
387              complete in seconds. Default value is 10 seconds.
388
389
390       Parameters
391              Contains arbitrary comma separated parameters used
392              to alter the behavior of the slurmdbd.
393
394              PreserveCaseUser
395                     When defining users do not force lower case
396                     which is the default behavior.
397
398
399       PidFile
400              Fully qualified pathname of a file into which  the
401              Slurm  Database  Daemon  may write its process ID.
402              This may be used for automated signal  processing.
403              The default value is "/var/run/slurmdbd.pid".
404
405
406       PluginDir
407              Identifies  the  places in which to look for Slurm
408              plugins.  This is a colon-separated list of direc‐
409              tories,  like  the PATH environment variable.  The
410              default value is "/usr/local/lib/slurm".
411
412
413       PrivateData
414              This controls what type of information  is  hidden
415              from  regular  users.  By default, all information
416              is visible to all users.   User  SlurmUser,  root,
417              and  users  with  AdminLevel=Admin can always view
418              all information.  Multiple values may be specified
419              with   a   comma   separator.   Acceptable  values
420              include:
421
422              accounts
423                     prevents users  from  viewing  any  account
424                     definitions unless they are coordinators of
425                     them.
426
427              events prevents users from viewing event  informa‐
428                     tion  unless  they  have operator status or
429                     above.
430
431              jobs   prevents users  from  viewing  job  records
432                     belonging  to  other  users unless they are
433                     coordinators of the association running the
434                     job when using sacct.
435
436              reservations
437                     restricts  getting  reservation information
438                     to users with operator status and above.
439
              usage  prevents users from viewing usage of any other user.
                     This applies to sreport.

              users  prevents users from viewing information of any user
                     other than themselves.  This also means users can only
                     see associations they deal with.  Coordinators can see
                     associations of all users they are coordinator of, but
                     can only see themselves when listing users.
450
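              As an illustration, several of the values above can be combined
              with a comma separator.  This sketch hides other users' job
              records, usage and user information from regular users:

                     PrivateData=jobs,usage,users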
451
452       PurgeEventAfter
              Events happening on the cluster over this age are purged from
              the database.  This includes node down times and such.  The
              time is a numeric value and is a number of months.  If you want
              to purge more often you can include "hours" or "days" after the
              numeric value to get those more frequent purges (e.g. a value
              of "12hours" would purge everything older than 12 hours).  The
              purge takes place at the start of each purge interval.  For
              example, if the purge time is 2 months, the purge would happen
              at the beginning of each month.  If not set (default), then
              event records are never purged.
466
467
468       PurgeJobAfter
              Individual job records over this age are purged from the
              database.  Aggregated information will be preserved to
              "PurgeUsageAfter".  The time is a numeric value and is a number
              of months.  If you want to purge more often you can include
              "hours" or "days" after the numeric value to get those more
              frequent purges (e.g. a value of "12hours" would purge
              everything older than 12 hours).  The purge takes place at the
              start of each purge interval.  For example, if the purge time
              is 2 months, the purge would happen at the beginning of each
              month.  If not set (default), then job records are never
              purged.
482
483
484       PurgeResvAfter
              Individual reservation records over this age are purged from
              the database.  Aggregated information will be preserved to
              "PurgeUsageAfter".  The time is a numeric value and is a number
              of months.  If you want to purge more often you can include
              "hours" or "days" after the numeric value to get those more
              frequent purges (e.g. a value of "12hours" would purge
              everything older than 12 hours).  The purge takes place at the
              start of each purge interval.  For example, if the purge time
              is 2 months, the purge would happen at the beginning of each
              month.  If not set (default), then reservation records are
              never purged.
498
499
500       PurgeStepAfter
              Individual job step records over this age are purged from the
              database.  Aggregated information will be preserved to
              "PurgeUsageAfter".  The time is a numeric value and is a number
              of months.  If you want to purge more often you can include
              "hours" or "days" after the numeric value to get those more
              frequent purges (e.g. a value of "12hours" would purge
              everything older than 12 hours).  The purge takes place at the
              start of each purge interval.  For example, if the purge time
              is 2 months, the purge would happen at the beginning of each
              month.  If not set (default), then job step records are never
              purged.
514
515
516       PurgeSuspendAfter
              Records of individual suspend times for jobs over this age are
              purged from the database.  Aggregated information will be
              preserved to "PurgeUsageAfter".  The time is a numeric value
              and is a number of months.  If you want to purge more often you
              can include "hours" or "days" after the numeric value to get
              those more frequent purges (e.g. a value of "12hours" would
              purge everything older than 12 hours).  The purge takes place
              at the start of each purge interval.  For example, if the purge
              time is 2 months, the purge would happen at the beginning of
              each month.  If not set (default), then suspend records are
              never purged.
531
532
533       PurgeTXNAfter
              Records of individual transaction times for transactions over
              this age are purged from the database.  The time is a numeric
              value and is a number of months.  If you want to purge more
              often you can include "hours" or "days" after the numeric value
              to get those more frequent purges (e.g. a value of "12hours"
              would purge everything older than 12 hours).  The purge takes
              place at the start of each purge interval.  For example, if the
              purge time is 2 months, the purge would happen at the beginning
              of each month.  If not set (default), then transaction records
              are never purged.
546
547
548       PurgeUsageAfter
              Usage records (Cluster, Association and WCKey) over this age
              are purged from the database.  The time is a numeric value and
              is a number of months.  If you want to purge more often you can
              include "hours" or "days" after the numeric value to get those
              more frequent purges (e.g. a value of "12hours" would purge
              everything older than 12 hours).  The purge takes place at the
              start of each purge interval.  For example, if the purge time
              is 2 months, the purge would happen at the beginning of each
              month.  If not set (default), then usage records are never
              purged.
561
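              The Purge*After options described above accept a bare number
              (months) or a number followed by a unit.  The following sketch
              mixes the forms; the retention periods themselves are arbitrary
              choices for illustration, not recommendations:

                     PurgeEventAfter=12hours   # events older than 12 hours
                     PurgeStepAfter=30days     # job steps older than 30 days
                     PurgeJobAfter=12month     # jobs older than 12 months
                     PurgeUsageAfter=24        # no unit given: 24 months

              Options left unset keep their records indefinitely.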
562
563       SlurmUser
              The name of the user that the slurmctld daemon executes as.
              This user must exist on the machine executing the Slurm
              Database Daemon and have the same user ID as on the hosts on
              which slurmctld executes.  For security purposes, a user other
              than "root" is recommended.  The default value is "root".
571
572
573       StorageHost
              Define the name of the host where the database that will store
              the data is running.  Ideally this should be the host on which
              slurmdbd executes.
578
579
580       StorageBackupHost
              Define the name of the backup host where the database that will
              store the data is running.  This can be viewed as a backup
              solution when the StorageHost is not responding.  It is up to
              the backup solution to enforce the coherency of the accounting
              information between the two hosts.  With clustered database
              solutions (active/passive HA), you would not need to use this
              feature.  Default is none.
590
591
592       StorageLoc
593              Specify  the  name of the database as the location
594              where accounting records are written. Defaults  to
595              "slurm_acct_db".
596
597
598       StoragePass
599              Define  the  password  used  to gain access to the
600              database to store the job accounting data. The '#'
601              character is not permitted in a password.
602
603
604       StoragePort
              The port number that the Slurm Database Daemon (slurmdbd) uses
              to communicate with the database.
607
608
609       StorageType
610              Define  the  accounting  storage  mechanism  type.
611              Acceptable  values  at  present  include "account‐
612              ing_storage/mysql".  The  value  "accounting_stor‐
613              age/mysql"   indicates   that  accounting  records
614              should be written to a MySQL or  MariaDB  database
615              specified by the StorageLoc parameter.  This value
616              must be specified.
617
618
619       StorageUser
              Define the name of the user used to connect to the database to
              store the job accounting data.
623
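              A sketch of a complete set of Storage* options for a database
              on the same host as slurmdbd.  The user name and password are
              placeholders, and 3306 is assumed here only because it is the
              conventional MySQL/MariaDB port:

                     StorageType=accounting_storage/mysql
                     StorageHost=localhost
                     StoragePort=3306
                     StorageLoc=slurm_acct_db
                     StorageUser=slurm
                     # Placeholder; the password must not contain '#'
                     StoragePass=change_me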
624
625       TCPTimeout
626              Time permitted for TCP  connection  to  be  estab‐
627              lished. Default value is 2 seconds.
628
629
630       TrackWCKey
              Boolean yes or no.  Used to set display and tracking of the
              Workload Characterization Key.  Must be set to track wckey
              usage and to generate rolled up usage tables from WCKeys.
              NOTE: If TrackWCKey is set here and not in your various
              slurm.conf files, all jobs will be attributed to their default
              WCKey.
638
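              To avoid the situation described in the NOTE above, the option
              would be enabled in both files, for example:

                     # slurmdbd.conf
                     TrackWCKey=yes

                     # slurm.conf on each cluster
                     TrackWCKey=yes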
639
640       TrackSlurmctldDown
641              Boolean  yes or no.  If set the slurmdbd will mark
642              all idle resources on the cluster as down  when  a
643              slurmctld  disconnects  or is no longer reachable.
644              The default is no.
645
646

EXAMPLE

       #
       # Sample /etc/slurmdbd.conf
       #
       ArchiveEvents=yes
       ArchiveJobs=yes
       ArchiveResvs=yes
       ArchiveSteps=no
       ArchiveSuspend=no
       ArchiveTXN=no
       ArchiveUsage=no
       #ArchiveScript=/usr/sbin/slurm.dbd.archive
       AuthInfo=/var/run/munge/munge.socket.2
       AuthType=auth/munge
       DbdHost=db_host
       DebugLevel=info
       PurgeEventAfter=1month
       PurgeJobAfter=12month
       PurgeResvAfter=1month
       PurgeStepAfter=1month
       PurgeSuspendAfter=1month
       PurgeTXNAfter=12month
       PurgeUsageAfter=24month
       LogFile=/var/log/slurmdbd.log
       PidFile=/var/tmp/jette/slurmdbd.pid
       SlurmUser=slurm_mgr
       StoragePass=shazaam
       StorageType=accounting_storage/mysql
       StorageUser=database_mgr

COPYING

679       Copyright (C) 2008-2010 Lawrence Livermore National Secu‐
680       rity.  Produced at Lawrence Livermore National Laboratory
681       (cf, DISCLAIMER).
682       Copyright (C) 2010-2014 SchedMD LLC.
683
684       This file is part of Slurm, a  resource  management  pro‐
685       gram.  For details, see <https://slurm.schedmd.com/>.
686
687       Slurm  is  free  software; you can redistribute it and/or
688       modify it under the  terms  of  the  GNU  General  Public
689       License  as  published  by  the Free Software Foundation;
690       either version 2 of the License, or (at your option)  any
691       later version.
692
693       Slurm  is distributed in the hope that it will be useful,
694       but WITHOUT ANY WARRANTY; without even the  implied  war‐
695       ranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PUR‐
696       POSE.  See  the  GNU  General  Public  License  for  more
697       details.
698
699

FILES

701       /etc/slurmdbd.conf
702
703

SEE ALSO

       slurm.conf(5), slurmctld(8), slurmdbd(8), syslog(2)

August 2018                Slurm Configuration File           slurmdbd.conf(5)