SPANK(8)                        Slurm Component                       SPANK(8)

NAME

       SPANK - Slurm Plug-in Architecture for Node and job (K)control

DESCRIPTION

       This manual briefly describes the capabilities of the Slurm Plug-in
       Architecture for Node and job Kontrol (SPANK), as well as the SPANK
       configuration file (by default, plugstack.conf).

       SPANK provides a very generic interface for stackable plug-ins which
       may be used to dynamically modify the job launch code in Slurm. SPANK
       plugins may be built without access to Slurm source code. They need
       only be compiled against Slurm's spank.h header file and added to the
       SPANK config file plugstack.conf; they will then be loaded at runtime
       during the next job launch. Thus, the SPANK infrastructure gives
       administrators and other developers a low-cost, low-effort way to
       dynamically modify the runtime behavior of Slurm job launch.

       Note: SPANK plugins using the Slurm APIs need to be recompiled when
       upgrading Slurm to a new major release.

SPANK PLUGINS

       SPANK plugins are loaded in up to five separate contexts during a Slurm
       job. Briefly, the five contexts are:

       local   In local context, the plugin is loaded by srun (i.e., the
               "local" part of a parallel job).

       remote  In remote context, the plugin is loaded by slurmstepd (i.e.,
               the "remote" part of a parallel job).

       allocator
               In allocator context, the plugin is loaded in one of the job
               allocation utilities sbatch or salloc.

       slurmd  In slurmd context, the plugin is loaded in the slurmd daemon
               itself. Note: Plugins loaded in slurmd context persist for the
               entire time slurmd is running, so if configuration is changed
               or plugins are updated, slurmd must be restarted for the
               changes to take effect.

       job_script
               In the job_script context, plugins are loaded in the context of
               the job prolog or epilog. Note: Plugins are loaded in
               job_script context on each run of the job prolog or epilog, in
               a separate address space from plugins in slurmd context. This
               means there is no state shared between this context and other
               contexts, or even between one call to slurm_spank_job_prolog or
               slurm_spank_job_epilog and subsequent calls.

       In local context, only the init, exit, init_post_opt, and
       local_user_init functions are called. In allocator context, only the
       init, exit, and init_post_opt functions are called. Similarly, in
       slurmd context, only the init and slurmd_exit callbacks are active, and
       in the job_script context, only the job_prolog and job_epilog callbacks
       are used. Plugins may query the context in which they are running with
       the spank_context and spank_remote functions defined in
       <slurm/spank.h>.

       SPANK plugins may be called from multiple points during the Slurm job
       launch. A plugin may define the following functions:

       slurm_spank_init
         Called just after plugins are loaded. In remote context, this is just
         after the job step is initialized. This function is called before any
         plugin option processing.

       slurm_spank_job_prolog
         Called at the same time as the job prolog. If this function returns a
         negative value and the SPANK plugin that contains it is required in
         plugstack.conf, the node that this is run on will be drained.

       slurm_spank_init_post_opt
         Called at the same point as slurm_spank_init, but after all user
         options to the plugin have been processed. The init and init_post_opt
         callbacks are separated so that plugins can process system-wide
         options specified in plugstack.conf in the init callback, then
         process user options, and finally take some action in
         slurm_spank_init_post_opt if necessary. In the case of a
         heterogeneous job, slurm_spank_init is invoked once per job
         component.

       slurm_spank_local_user_init
         Called in local (srun) context only, after all options have been
         processed. This is called after the job ID and step IDs are
         available. This happens in srun after the allocation is made, but
         before tasks are launched.

       slurm_spank_user_init
         Called after privileges are temporarily dropped. (remote context
         only)

       slurm_spank_task_init_privileged
         Called for each task just after fork, but before all elevated
         privileges are dropped. (remote context only)

       slurm_spank_task_init
         Called for each task just before execve(2). If you are restricting
         memory with cgroups, memory allocated here will be in the job's
         cgroup. (remote context only)

       slurm_spank_task_post_fork
         Called for each task from the parent process after fork(2) is
         complete. Because slurmd does not exec any tasks until all tasks
         have completed fork(2), this call is guaranteed to run before the
         user task is executed. (remote context only)

       slurm_spank_task_exit
         Called for each task as its exit status is collected by Slurm.
         (remote context only)

       slurm_spank_exit
         Called once just before slurmstepd exits in remote context. In local
         context, called before srun exits.

       slurm_spank_job_epilog
         Called at the same time as the job epilog. If this function returns a
         negative value and the SPANK plugin that contains it is required in
         plugstack.conf, the node that this is run on will be drained.

       slurm_spank_slurmd_exit
         Called in slurmd when the daemon is shut down.

       All of these functions have the same prototype, for example:

          int slurm_spank_init (spank_t spank, int ac, char *argv[])

       where spank is the SPANK handle, which must be passed back to Slurm
       when the plugin calls functions like spank_get_item and spank_getenv.
       Configured arguments (see CONFIGURATION below) are passed in the
       argument vector argv with argument count ac.
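
       As an illustration of the ac/argv convention, the following minimal
       sketch looks up a "key=value" argument of the kind configured in
       plugstack.conf. The helper name find_plugin_arg is hypothetical and is
       not part of spank.h; the sketch assumes each configured argument
       arrives as one argv entry starting at index 0, as in the renice
       example later in this manual.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of spank.h): return a pointer to the value
 * of the first "key=value" entry in argv[0..ac-1], or NULL if no entry has
 * exactly that key.
 */
static const char *find_plugin_arg(int ac, char *argv[], const char *key)
{
    size_t klen = strlen(key);

    for (int i = 0; i < ac; i++) {
        /* The key must match in full and be followed directly by '='. */
        if (strncmp(argv[i], key, klen) == 0 && argv[i][klen] == '=')
            return argv[i] + klen + 1;
    }
    return NULL;
}
```

       A plugin could call such a helper from slurm_spank_init with the ac
       and argv it receives, then validate the returned string itself.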

       SPANK plugins can query the current list of supported slurm_spank
       symbols to determine if the current version supports a given plugin
       hook. This may be useful because the list of plugin symbols may grow in
       the future. The query is done using the spank_symbol_supported
       function, which has the following prototype:

           int spank_symbol_supported (const char *sym);

       The return value is 1 if the symbol is supported, 0 if not.

       SPANK plugins do not have direct access to internally defined Slurm
       data structures. Instead, information about the currently executing job
       is obtained via the spank_get_item function call.

         spank_err_t spank_get_item (spank_t spank, spank_item_t item, ...);

       The spank_get_item call must be passed the current SPANK handle as well
       as the item requested, which is defined by the passed spank_item_t. A
       variable number of pointer arguments are also passed, depending on
       which item was requested by the plugin. A list of the valid values for
       item is kept in the spank.h header file. Some examples are:

       S_JOB_UID
         User id for running job. (uid_t *) is third arg of spank_get_item.

       S_JOB_STEPID
         Job step id for running job. (uint32_t *) is third arg of
         spank_get_item.

       S_TASK_EXIT_STATUS
         Exit status for exited task. Only valid from slurm_spank_task_exit.
         (int *) is third arg of spank_get_item.

       S_JOB_ARGV
         Complete job command line. Third and fourth args to spank_get_item
         are (int *, char ***).

       See spank.h for more details, and EXAMPLES below for an example of
       spank_get_item usage.

       SPANK functions in the local and allocator contexts should use the
       getenv, setenv, and unsetenv functions to view and modify the job's
       environment. SPANK functions in the remote context should use the
       spank_getenv, spank_setenv, and spank_unsetenv functions to view and
       modify the job's environment. spank_getenv searches the job's
       environment for the environment variable var and copies the current
       value into a buffer buf of length len. spank_setenv allows a SPANK
       plugin to set or overwrite a variable in the job's environment, and
       spank_unsetenv unsets an environment variable in the job's environment.
       The prototypes are:

        spank_err_t spank_getenv (spank_t spank, const char *var,
                            char *buf, int len);
        spank_err_t spank_setenv (spank_t spank, const char *var,
                            const char *val, int overwrite);
        spank_err_t spank_unsetenv (spank_t spank, const char *var);

       These functions are only necessary in remote context; in local context
       the standard process environment may be modified directly with
       setenv(3), getenv(3), and unsetenv(3).

       Functions are also available from within the SPANK plugins to establish
       environment variables to be exported to the Slurm PrologSlurmctld,
       Prolog, Epilog and EpilogSlurmctld programs (the so-called job control
       environment). The names of environment variables established by these
       calls will be prefixed with the string SPANK_ in order to avoid any
       security implications of arbitrary environment variable control. (After
       all, the job control scripts do run as root or the Slurm user.)

       These functions are available from local context only.

         spank_err_t spank_job_control_getenv(spank_t spank, const char *var,
                              char *buf, int len);
         spank_err_t spank_job_control_setenv(spank_t spank, const char *var,
                              const char *val, int overwrite);
         spank_err_t spank_job_control_unsetenv(spank_t spank, const char *var);

       See spank.h for more information, and EXAMPLES below for an example of
       spank_getenv usage.
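
       To illustrate the SPANK_ prefix on the receiving side, here is a
       minimal sketch of how a prolog or epilog program written in C might
       read a job control variable. The helper name prolog_getenv and the
       variable name RENICE are hypothetical; the only behavior taken from
       this manual is that the variable name is prefixed with SPANK_.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper for a prolog/epilog program: look up the job control
 * variable that a plugin established under the name `var`. Because the name
 * is prefixed with "SPANK_", the process sees it as SPANK_<var>.
 * Returns NULL if the variable is unset or the name is too long.
 */
static const char *prolog_getenv(const char *var)
{
    char name[256];
    int n = snprintf(name, sizeof(name), "SPANK_%s", var);

    if (n < 0 || (size_t) n >= sizeof(name))
        return NULL;
    return getenv(name);
}
```

       For example, a variable established by a plugin under the name RENICE
       would be looked up as prolog_getenv("RENICE"), which reads
       SPANK_RENICE from the environment.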

       Many of the described SPANK functions available to plugins return
       errors via the spank_err_t error type. On success, the return value
       will be set to ESPANK_SUCCESS, while on failure, the return value will
       be set to one of many error values defined in slurm/spank.h. The SPANK
       interface provides a simple function

         const char * spank_strerror(spank_err_t err);

       which may be used to translate a spank_err_t value into its string
       representation.

       The slurm_spank_log function can be used to print messages back to the
       user at an error level. This is to keep users from having to rely on
       the slurm_error function, which can be confusing because it prepends
       "error:" to every message.

SPANK OPTIONS

       SPANK plugins also have an interface through which they may define and
       implement extra job options. These options are made available to the
       user through Slurm commands such as srun(1), salloc(1), and sbatch(1).
       If the option is specified by the user, its value is forwarded and
       registered with the plugin in slurmd when the job is run. In this way,
       SPANK plugins may dynamically provide new options and functionality to
       Slurm.

       Each option registered by a plugin to Slurm takes the form of a struct
       spank_option which is declared in <slurm/spank.h> as

          struct spank_option {
             char *         name;
             char *         arginfo;
             char *         usage;
             int            has_arg;
             int            val;
             spank_opt_cb_f cb;
          };

       Where

       name   is the name of the option. Its length is limited to
              SPANK_OPTION_MAXLEN defined in <slurm/spank.h>.

       arginfo
              is a description of the argument to the option, if the option
              does take an argument.

       usage  is a short description of the option suitable for --help output.

       has_arg
              0 if the option takes no argument, 1 if the option takes an
              argument, and 2 if the option takes an optional argument. (See
              getopt_long(3).)

       val    A plugin-local value to return to the option callback function.

       cb     A callback function that is invoked when the plugin option is
              registered with Slurm. spank_opt_cb_f is typedef'd in
              <slurm/spank.h> as

                typedef int (*spank_opt_cb_f) (int val, const char *optarg,
                                         int remote);

              Where val is the value of the val field in the spank_option
              struct, optarg is the supplied argument if applicable, and
              remote is 0 if the function is being called from the "local"
              host (e.g., the host where srun or sbatch/salloc is invoked) or
              1 if it is being called from the "remote" host (the host where
              slurmd/slurmstepd run). On the remote host, the callback is
              executed only by slurmstepd (remote context), and only if the
              option was registered for that context.
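
       The has_arg values 0, 1, and 2 described above match the getopt_long(3)
       constants no_argument, required_argument, and optional_argument. The
       following minimal sketch uses getopt_long directly, outside of Slurm,
       only to illustrate what each value means for the argument a callback
       can expect. The option names and the demo_result type are hypothetical,
       and nothing here is meant to describe how Slurm itself parses options.

```c
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option table, one entry per has_arg value. */
static const struct option demo_opts[] = {
    { "flag",   no_argument,       NULL, 'f' },  /* has_arg == 0 */
    { "level",  required_argument, NULL, 'l' },  /* has_arg == 1 */
    { "renice", optional_argument, NULL, 'r' },  /* has_arg == 2 */
    { NULL, 0, NULL, 0 }
};

struct demo_result {
    int         flag;         /* 1 if --flag was given */
    const char *level;        /* argument of --level, or NULL */
    int         renice_seen;  /* 1 if --renice was given */
    const char *renice;       /* argument of --renice, NULL if omitted */
};

/* Parse argv with getopt_long and record what was seen. */
static int parse_demo(int argc, char **argv, struct demo_result *out)
{
    int c;

    memset(out, 0, sizeof(*out));
    optind = 1;  /* restart scanning at the first argument */

    while ((c = getopt_long(argc, argv, "", demo_opts, NULL)) != -1) {
        switch (c) {
        case 'f':
            out->flag = 1;
            break;
        case 'l':
            out->level = optarg;   /* never NULL for required_argument */
            break;
        case 'r':
            out->renice_seen = 1;
            out->renice = optarg;  /* NULL when given as plain --renice */
            break;
        default:
            return -1;             /* unknown option or missing argument */
        }
    }
    return 0;
}
```

       The practical consequence for has_arg == 2 is that a callback should
       be prepared for a NULL optarg when the user gives the option without a
       value.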

       Plugin options may be registered with Slurm using the
       spank_option_register function. This function is only valid when called
       from the plugin's slurm_spank_init handler, and registers one option at
       a time. The prototype is

          spank_err_t spank_option_register (spank_t sp,
                    struct spank_option *opt);

       This function will return ESPANK_SUCCESS on successful registration of
       an option, or ESPANK_BAD_ARG for errors including an invalid spank_t
       handle, or when the function is not called from the slurm_spank_init
       function. All options need to be registered from all contexts in which
       they will be used. For instance, if an option is only used in local
       (srun) and remote (slurmd) contexts, then spank_option_register should
       only be called from within those contexts. For example:

          if (spank_context() != S_CTX_ALLOCATOR)
             spank_option_register (sp, opt);

       If, however, the option is used in all contexts, spank_option_register
       needs to be called everywhere.

       In addition to spank_option_register, plugins may also export options
       to Slurm by defining a table of struct spank_option with the symbol
       name spank_options. This method, however, is not supported for use with
       sbatch and salloc (allocator context), thus the use of
       spank_option_register is preferred. When using the spank_options table,
       the final element in the array must be filled with zeros. A
       SPANK_OPTIONS_TABLE_END macro is provided in <slurm/spank.h> for this
       purpose.

       When an option is provided by the user on the local side, either by
       command line options or by environment variables, Slurm will
       immediately invoke the option's callback with remote=0. This is meant
       for the plugin to do local sanity checking of the option before the
       value is sent to the remote side during job launch. If the argument the
       user specified is invalid, the plugin should report an error and return
       a non-zero value from the callback. The plugin should be able to handle
       cases where the SPANK option is set multiple times through environment
       variables and command line options. Environment variables are
       processed before command line options.
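
       One way to tolerate an option being set more than once is to let the
       last valid value win, which, given the processing order above, means a
       command line value overrides one from the environment. The following
       minimal sketch has the same shape as spank_opt_cb_f but is written
       without <slurm/spank.h> so that it stands alone. The option, its 0 to 9
       range, and the names level and level_opt_cb are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

#define LEVEL_NOT_SET (-1)

/* Hypothetical plugin-global state filled in by the callback. */
static int level = LEVEL_NOT_SET;

/*
 * Callback with the same parameters as spank_opt_cb_f. It may be invoked
 * several times for one job; each valid call overwrites the stored value,
 * and an invalid call leaves the previous value untouched.
 * Returns 0 on success and non-zero on a bad argument.
 */
static int level_opt_cb(int val, const char *optarg, int remote)
{
    char *end = NULL;
    long l;

    (void) val;     /* unused in this sketch */
    (void) remote;  /* same validation on the local and remote side */

    if (optarg == NULL || *optarg == '\0')
        return -1;

    errno = 0;
    l = strtol(optarg, &end, 10);
    if (errno != 0 || *end != '\0' || l < 0 || l > 9)
        return -1;

    level = (int) l;
    return 0;
}
```

       A real plugin would additionally report the error to the user, for
       example with slurm_error, before returning the non-zero value.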

       On the remote side, options and their arguments are registered just
       after SPANK plugins are loaded and before the slurm_spank_init handler
       is called. This allows plugins to modify behavior of all plugin
       functionality based on the value of user-provided options. (See
       EXAMPLES below for a plugin that registers an option with Slurm.)

       As an alternative to use of an option callback and global variable,
       plugins can use the spank_option_getopt function to check for supplied
       options after option processing. This function has the prototype:

          spank_err_t spank_option_getopt(spank_t sp,
              struct spank_option *opt, char **optargp);

       This function returns ESPANK_SUCCESS if the option defined in the
       struct spank_option opt has been used by the user. If optargp is
       non-NULL then it is set to any option argument passed (if the option
       takes an argument). The use of this method is required to process
       options in job_script context (slurm_spank_job_prolog and
       slurm_spank_job_epilog). This function is valid in the following
       callbacks: slurm_spank_job_prolog, slurm_spank_local_user_init,
       slurm_spank_user_init, slurm_spank_task_init_privileged,
       slurm_spank_task_init, slurm_spank_task_exit, and
       slurm_spank_job_epilog.

CONFIGURATION

       The default SPANK plug-in stack configuration file is plugstack.conf in
       the same directory as slurm.conf(5), though this may be changed via the
       Slurm config parameter PlugStackConfig. Normally the plugstack.conf
       file should be identical on all nodes of the cluster. The config file
       lists SPANK plugins, one per line, along with whether the plugin is
       required or optional, and any global arguments that are to be passed to
       the plugin for runtime configuration. Comments are preceded with '#'
       and extend to the end of the line. If the configuration file is missing
       or empty, it will simply be ignored.

       The format of each non-comment line in the configuration file is:

         required/optional   plugin   arguments

       For example:

         optional /usr/lib/slurm/test.so

       This tells slurmd to load the plugin test.so, passing no arguments. If
       a SPANK plugin is required, then failure of any of the plugin's
       functions will cause slurmd to terminate the job, while optional
       plugins only cause a warning.

       If a fully-qualified path is not specified for a plugin, then the
       currently configured PluginDir in slurm.conf(5) is searched.

       SPANK plugins are stackable, meaning that more than one plugin may be
       placed into the config file. The plugins will simply be called in
       order, one after the other, and appropriate action taken on failure
       given the state of the plugin's optional flag.

       Additional config files or directories of config files may be included
       in plugstack.conf with the include keyword. The include keyword must
       appear on its own line, and takes a glob as its parameter, so multiple
       files may be included from one include line. For example, the following
       syntax will load all config files in the /etc/slurm/plugstack.conf.d
       directory, in local collation order:

         include /etc/slurm/plugstack.conf.d/*

       which might be considered a more flexible method for building up a
       SPANK plugin stack.

       The SPANK config file is re-read on each job launch, so editing the
       config file will not affect running jobs. However, care should be taken
       so that a partially edited config file is not read by a launching job.
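
       One common way to avoid exposing a partially written file is to write
       the new contents to a temporary file in the same directory and then
       rename(2) it over the old one, since a rename within one filesystem
       replaces the target atomically. The following is a minimal sketch of
       that general technique, not something Slurm provides or requires. The
       helper name replace_file_atomically and the 0644 mode are assumptions
       chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace `path` with `contents` so that a concurrent
 * reader sees either the complete old file or the complete new file.
 * Returns 0 on success, -1 on failure.
 */
static int replace_file_atomically(const char *path, const char *contents)
{
    char tmp[4096];
    size_t len = strlen(contents);
    size_t off = 0;
    int fd;
    int n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);

    if (n < 0 || (size_t) n >= sizeof(tmp))
        return -1;

    /* The temporary file lives next to the target, on the same filesystem. */
    fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    /* Write everything, allowing for short writes. */
    while (off < len) {
        ssize_t w = write(fd, contents + off, len - off);
        if (w < 0)
            goto fail;
        off += (size_t) w;
    }

    /* mkstemp creates the file with mode 0600; make it world-readable
     * (assumed mode) and flush it to disk before it becomes visible. */
    if (fchmod(fd, 0644) != 0 || fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* Atomically swap the new file into place. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

       The same effect can be had from a shell by editing a copy of the file
       and moving it into place with mv(1) within the same directory.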

EXAMPLE: renice.so

       /etc/slurm/plugstack.conf:
              This example plugstack.conf file shows a configuration that
              activates the renice.so SPANK plugin.
              #
              # SPANK config file
              #
              # required?       plugin                     parameters
              #
              optional          /usr/lib/SPANK_renice.so   min_prio=-10

       /usr/local/src/renice.c:
              A sample SPANK plugin to modify the nice value of job tasks.
              This plugin adds a --renice=[prio] option to srun which users
              can use to set the priority of all remote tasks. Priority may
              also be specified via a SLURM_RENICE environment variable. A
              minimum priority may be established via a "min_prio" parameter
              in plugstack.conf.
              #include <sys/types.h>
              #include <stdio.h>
              #include <stdlib.h>
              #include <unistd.h>
              #include <string.h>
              #include <sys/resource.h>

              #include <slurm/spank.h>

              /*
               * All spank plugins must define this macro for the
               * Slurm plugin loader.
               */
              SPANK_PLUGIN(renice, 1);

              #define PRIO_ENV_VAR "SLURM_RENICE"
              #define PRIO_NOT_SET -1

              /*
               * Minimum allowable value for priority. May be
               * set globally via plugin option min_prio=<prio>
               */
              static int min_prio = -20;

              static int prio = PRIO_NOT_SET;

              static int _renice_opt_process(int val, const char *optarg, int remote);
              static int _str2prio(const char *str, int *p2int);

              /*
               *  Provide a --renice=[prio] option to srun:
               */
              struct spank_option spank_options[] =
              {
                  {
                      "renice",
                      "[prio]",
                      "Re-nice job tasks to priority [prio].",
                      2,
                      0,
                      _renice_opt_process
                  },
                  SPANK_OPTIONS_TABLE_END
              };

              /*
               *  Called from both srun and slurmd.
               */
              int slurm_spank_init(spank_t sp, int ac, char **av)
              {
                  int i;

                  /* Don't do anything in sbatch/salloc */
                  if (spank_context () == S_CTX_ALLOCATOR)
                      return ESPANK_SUCCESS;

                  /* Process arguments configured in plugstack.conf */
                  for (i = 0; i < ac; i++) {
                      if (!strncmp("min_prio=", av[i], 9)) {
                          const char *optarg = av[i] + 9;

                          if (_str2prio(optarg, &min_prio))
                              slurm_error ("Ignoring invalid min_prio value: %s", av[i]);
                      } else {
                          slurm_error ("renice: Invalid option: %s", av[i]);
                      }
                  }

                  if (!spank_remote(sp))
                      slurm_verbose("renice: min_prio = %d", min_prio);

                  return ESPANK_SUCCESS;
              }

              /*
               *  Called in the parent for each task, before the task runs.
               */
              int slurm_spank_task_post_fork(spank_t sp, int ac, char **av)
              {
                  int rc;
                  pid_t pid;
                  int taskid;

                  if (prio == PRIO_NOT_SET) {
                      /* See if SLURM_RENICE env var is set by user */
                      char val[1024];

                      rc = spank_getenv(sp, PRIO_ENV_VAR, val, sizeof(val));

                      /* Neither --renice nor SLURM_RENICE given: nothing to do */
                      if (rc != ESPANK_SUCCESS)
                          return ESPANK_SUCCESS;

                      rc = _str2prio(val, &prio);

                      if (rc) {
                          slurm_error("Bad value for %s: %s", PRIO_ENV_VAR, val);
                          return rc;
                      }

                      if (prio < min_prio) {
                          slurm_error("%s=%d not allowed, using min=%d",
                                      PRIO_ENV_VAR, prio, min_prio);
                      }
                  }

                  /* Clamp the requested priority to the configured minimum */
                  if (prio < min_prio)
                      prio = min_prio;

                  spank_get_item(sp, S_TASK_GLOBAL_ID, &taskid);
                  spank_get_item(sp, S_TASK_PID, &pid);

                  slurm_info("re-nicing task%d pid %d to %d", taskid, (int) pid, prio);

                  if (setpriority(PRIO_PROCESS, (int) pid, (int) prio)) {
                      slurm_error("setpriority: %m");
                      return -ESPANK_ERROR;
                  }

                  return ESPANK_SUCCESS;
              }

              /*
               *  Convert str to a priority in [-20, 20]; store it in *p2int.
               */
              static int _str2prio(const char *str, int *p2int)
              {
                  long l;
                  char *p = NULL;

                  if (!str || str[0] == '\0')
                      return -ESPANK_BAD_ARG;

                  l = strtol(str, &p, 10);

                  if (!p || (*p != '\0'))
                      return -ESPANK_BAD_ARG;

                  if ((l < -20) || (l > 20)) {
                      slurm_error("Specify value between -20 and 20");
                      return -ESPANK_BAD_ARG;
                  }

                  *p2int = (int) l;

                  return ESPANK_SUCCESS;
              }

              /*
               *  Option callback for --renice.
               */
              static int _renice_opt_process(int val, const char *optarg, int remote)
              {
                  int rc;

                  if (optarg == NULL) {
                      slurm_error("renice: invalid NULL argument!");
                      return -ESPANK_BAD_ARG;
                  }

                  if ((rc = _str2prio(optarg, &prio))) {
                      slurm_error("Bad value for --renice: %s", optarg);
                      return rc;
                  }

                  if (prio < min_prio) {
                      slurm_error("--renice=%d not allowed, will use min=%d",
                                  prio, min_prio);
                  }

                  return ESPANK_SUCCESS;
              }

       Compile command:
              # gcc -ggdb3 -I${SLURM_PATH}/include/ -fPIC -shared -o /usr/lib/SPANK_renice.so /usr/local/src/renice.c

COPYING

       Portions copyright (C) 2010-2018 SchedMD LLC. Copyright (C) 2006 The
       Regents of the University of California. Produced at Lawrence
       Livermore National Laboratory (cf, DISCLAIMER). CODE-OCEC-09-009. All
       rights reserved.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

FILES

       /etc/slurm/slurm.conf - Slurm configuration file.
       /etc/slurm/plugstack.conf - SPANK configuration file.
       /usr/include/slurm/spank.h - SPANK header file.

SEE ALSO

       srun(1), slurm.conf(5)

April 2021                      Slurm Component                       SPANK(8)