SPANK(8)                        Slurm Component                       SPANK(8)

NAME

       SPANK - Slurm Plug-in Architecture for Node and job (K)control

DESCRIPTION

       This manual briefly describes the capabilities of the Slurm Plug-in
       Architecture for Node and job Kontrol (SPANK), as well as the SPANK
       configuration file (by default, plugstack.conf).

       SPANK provides a very generic interface for stackable plug-ins which
       may be used to dynamically modify the job launch code in Slurm. SPANK
       plugins may be built without access to Slurm source code. They need
       only be compiled against Slurm's spank.h header file and added to the
       SPANK config file plugstack.conf; they will then be loaded at runtime
       during the next job launch. Thus, the SPANK infrastructure gives
       administrators and other developers a low-cost, low-effort way to
       dynamically modify the runtime behavior of Slurm job launch.

       Note: SPANK plugins using the Slurm APIs need to be recompiled when
       upgrading Slurm to a new major release.

SPANK PLUGINS

       SPANK plugins are loaded in up to five separate contexts during a Slurm
       job. Briefly, the five contexts are:

       local   In local context, the plugin is loaded by srun (i.e. the
               "local" part of a parallel job).

       remote  In remote context, the plugin is loaded by slurmstepd (i.e.
               the "remote" part of a parallel job).

       allocator
               In allocator context, the plugin is loaded in one of the job
               allocation utilities sbatch or salloc.

       slurmd  In slurmd context, the plugin is loaded in the slurmd daemon
               itself. Note: Plugins loaded in slurmd context persist for the
               entire time slurmd is running, so if configuration is changed
               or plugins are updated, slurmd must be restarted for the
               changes to take effect.

       job_script
               In the job_script context, plugins are loaded in the context of
               the job prolog or epilog. Note: Plugins are loaded in job_script
               context on each run of the job prolog or epilog, in a separate
               address space from plugins in slurmd context. This means there
               is no state shared between this context and other contexts, or
               even between one call to slurm_spank_job_prolog or
               slurm_spank_job_epilog and subsequent calls.

       In local context, only the init, exit, init_post_opt, and
       local_user_init functions are called. In allocator context, only the
       init, exit, and init_post_opt functions are called. Similarly, in
       slurmd context, only the init and slurmd_exit callbacks are active, and
       in the job_script context, only the job_prolog and job_epilog callbacks
       are used. Plugins may query the context in which they are running with
       the spank_context and spank_remote functions defined in
       <slurm/spank.h>.

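       The context rules above can be summarized as a lookup table. The
       following is a minimal, self-contained sketch in plain C. The enum
       names, the bit flags and callback_runs_in are hypothetical names
       invented for this illustration; they are not part of <slurm/spank.h>.
       The remote row is an assumption inferred from the "(remote context
       only)" notes in the callback descriptions of this manual.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the types in <slurm/spank.h>. */
enum ctx {
    CTX_LOCAL, CTX_REMOTE, CTX_ALLOCATOR, CTX_SLURMD, CTX_JOB_SCRIPT,
    CTX_COUNT
};

/* One bit per slurm_spank_* callback (the common prefix is omitted). */
enum cb {
    CB_INIT            = 1 << 0,
    CB_INIT_POST_OPT   = 1 << 1,
    CB_LOCAL_USER_INIT = 1 << 2,
    CB_USER_INIT       = 1 << 3,
    CB_TASK_INIT_PRIV  = 1 << 4,
    CB_TASK_INIT       = 1 << 5,
    CB_TASK_POST_FORK  = 1 << 6,
    CB_TASK_EXIT       = 1 << 7,
    CB_EXIT            = 1 << 8,
    CB_JOB_PROLOG      = 1 << 9,
    CB_JOB_EPILOG      = 1 << 10,
    CB_SLURMD_EXIT     = 1 << 11
};

static const unsigned callbacks_by_ctx[CTX_COUNT] = {
    [CTX_LOCAL]      = CB_INIT | CB_INIT_POST_OPT | CB_LOCAL_USER_INIT |
                       CB_EXIT,
    /* Assumption: inferred from the "(remote context only)" notes. */
    [CTX_REMOTE]     = CB_INIT | CB_INIT_POST_OPT | CB_USER_INIT |
                       CB_TASK_INIT_PRIV | CB_TASK_INIT |
                       CB_TASK_POST_FORK | CB_TASK_EXIT | CB_EXIT,
    [CTX_ALLOCATOR]  = CB_INIT | CB_INIT_POST_OPT | CB_EXIT,
    [CTX_SLURMD]     = CB_INIT | CB_SLURMD_EXIT,
    [CTX_JOB_SCRIPT] = CB_JOB_PROLOG | CB_JOB_EPILOG
};

/* Returns true if the given callback is invoked in the given context. */
bool callback_runs_in(enum ctx c, enum cb callback)
{
    return (callbacks_by_ctx[c] & (unsigned) callback) != 0;
}
```

       A real plugin would obtain its context from spank_context rather than
       from a table like this; the sketch only restates the rules in a form
       that is easy to check.
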
       SPANK plugins may be called from multiple points during the Slurm job
       launch. A plugin may define the following functions:

       slurm_spank_init
         Called just after plugins are loaded. In remote context, this is just
         after the job step is initialized. This function is called before any
         plugin option processing.

       slurm_spank_job_prolog
         Called at the same time as the job prolog. If this function returns a
         negative value and the SPANK plugin that contains it is required in
         plugstack.conf, the node that this is run on will be drained.

       slurm_spank_init_post_opt
         Called at the same point as slurm_spank_init, but after all user
         options to the plugin have been processed. The init and init_post_opt
         callbacks are separate so that plugins can process system-wide
         options specified in plugstack.conf in the init callback, then
         process user options, and finally take some action in
         slurm_spank_init_post_opt if necessary. In the case of a
         heterogeneous job, slurm_spank_init is invoked once per job
         component.

       slurm_spank_local_user_init
         Called in local (srun) context only, after all options have been
         processed. This is called after the job ID and step IDs are
         available. This happens in srun after the allocation is made, but
         before tasks are launched.

       slurm_spank_user_init
         Called after privileges are temporarily dropped. (remote context
         only)

       slurm_spank_task_init_privileged
         Called for each task just after fork, but before all elevated
         privileges are dropped. (remote context only)

       slurm_spank_task_init
         Called for each task just before execve(2). If you are restricting
         memory with cgroups, memory allocated here will be in the job's
         cgroup. (remote context only)

       slurm_spank_task_post_fork
         Called for each task from the parent process after fork(2) is
         complete. Because slurmd does not exec any tasks until all tasks
         have completed fork(2), this call is guaranteed to run before the
         user task is executed. (remote context only)

       slurm_spank_task_exit
         Called for each task as its exit status is collected by Slurm.
         (remote context only)

       slurm_spank_exit
         Called once just before slurmstepd exits in remote context. In local
         context, called before srun exits.

       slurm_spank_job_epilog
         Called at the same time as the job epilog. If this function returns a
         negative value and the SPANK plugin that contains it is required in
         plugstack.conf, the node that this is run on will be drained.

       slurm_spank_slurmd_exit
         Called in slurmd when the daemon is shut down.

       All of these functions have the same prototype, for example:

          int slurm_spank_init (spank_t spank, int ac, char *argv[])

       Where spank is the SPANK handle which must be passed back to Slurm when
       the plugin calls functions like spank_get_item and spank_getenv.
       Configured arguments (see CONFIGURATION below) are passed in the
       argument vector argv with argument count ac.

       SPANK plugins can query the current list of supported slurm_spank
       symbols to determine if the current version supports a given plugin
       hook. This may be useful because the list of plugin symbols may grow in
       the future. The query is done using the spank_symbol_supported
       function, which has the following prototype:

           int spank_symbol_supported (const char *sym);

       The return value is 1 if the symbol is supported, 0 if not.

       SPANK plugins do not have direct access to internally defined Slurm
       data structures. Instead, information about the currently executing job
       is obtained via the spank_get_item function call.

         spank_err_t spank_get_item (spank_t spank, spank_item_t item, ...);

       The spank_get_item call must be passed the current SPANK handle as well
       as the item requested, which is defined by the passed spank_item_t. A
       variable number of pointer arguments are also passed, depending on
       which item was requested by the plugin. A list of the valid values for
       item is kept in the spank.h header file. Some examples are:

       S_JOB_UID
         User id for running job. (uid_t *) is third arg of spank_get_item.

       S_JOB_STEPID
         Job step id for running job. (uint32_t *) is third arg of
         spank_get_item.

       S_TASK_EXIT_STATUS
         Exit status for exited task. Only valid from slurm_spank_task_exit.
         (int *) is third arg of spank_get_item.

       S_JOB_ARGV
         Complete job command line. Third and fourth args to spank_get_item
         are (int *, char ***).

       See spank.h for more details, and EXAMPLES below for an example of
       spank_get_item usage.

       SPANK functions in the local and allocator environment should use the
       getenv, setenv, and unsetenv functions to view and modify the job's
       environment. SPANK functions in the remote environment should use the
       spank_getenv, spank_setenv, and spank_unsetenv functions to view and
       modify the job's environment. spank_getenv searches the job's
       environment for the environment variable var and copies the current
       value into a buffer buf of length len. spank_setenv allows a SPANK
       plugin to set or overwrite a variable in the job's environment, and
       spank_unsetenv unsets an environment variable in the job's
       environment. The prototypes are:

        spank_err_t spank_getenv (spank_t spank, const char *var,
                            char *buf, int len);
        spank_err_t spank_setenv (spank_t spank, const char *var,
                            const char *val, int overwrite);
        spank_err_t spank_unsetenv (spank_t spank, const char *var);

       These functions are only necessary in remote context, since in local
       context the standard process environment can be modified directly with
       setenv(3), getenv(3), and unsetenv(3).

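       The buffer-based lookup described for spank_getenv can be modelled
       without Slurm. The following sketch searches a NULL-terminated array of
       "NAME=value" strings and copies the value into a caller-supplied
       buffer. It is an illustration of the described semantics only, not
       Slurm's implementation: env_copy and the copy_result codes are
       hypothetical, and how the real function reports a missing variable or
       a short buffer is defined by the spank_err_t values in <slurm/spank.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch only; the real spank_err_t
 * values are defined in <slurm/spank.h>. */
enum copy_result { COPY_OK, COPY_NOT_FOUND, COPY_NO_SPACE };

/*
 * Search a NULL-terminated array of "NAME=value" strings for var and copy
 * its value, including the terminating NUL, into buf of size len.
 */
enum copy_result env_copy(char *const env[], const char *var,
                          char *buf, size_t len)
{
    size_t namelen = strlen(var);

    for (size_t i = 0; env[i] != NULL; i++) {
        /* Match the whole name, not just a prefix of a longer name. */
        if (strncmp(env[i], var, namelen) == 0 && env[i][namelen] == '=') {
            const char *value = env[i] + namelen + 1;

            if (strlen(value) >= len)
                return COPY_NO_SPACE;   /* value would not fit in buf */
            strcpy(buf, value);
            return COPY_OK;
        }
    }
    return COPY_NOT_FOUND;
}
```

       The caller owns the buffer, so the sketch refuses to copy a value that
       would be truncated instead of returning a partial string.
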
       Functions are also available from within the SPANK plugins to establish
       environment variables to be exported to the Slurm PrologSlurmctld,
       Prolog, Epilog and EpilogSlurmctld programs (the so-called job control
       environment). The names of environment variables established by these
       calls will be prepended with the string SPANK_ in order to avoid any
       security implications of arbitrary environment variable control.
       (After all, the job control scripts do run as root or the Slurm user.)

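       As a small illustration of the naming rule, the following sketch builds
       the name under which a job control variable would be visible to the
       prolog and epilog programs. The helper job_control_name is hypothetical
       and exists only for this example; Slurm applies the prefix itself.

```c
#include <stdio.h>

/*
 * Build "SPANK_" followed by var in out.
 * Returns 0 on success, -1 if out is too small to hold the full name.
 */
int job_control_name(const char *var, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "SPANK_%s", var);

    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

       Under this rule, a variable that a plugin establishes as MY_VAR would
       be read by a prolog script as SPANK_MY_VAR.
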
       These functions are available from local context only.

         spank_err_t spank_job_control_getenv(spank_t spank, const char *var,
                              char *buf, int len);
         spank_err_t spank_job_control_setenv(spank_t spank, const char *var,
                              const char *val, int overwrite);
         spank_err_t spank_job_control_unsetenv(spank_t spank, const char *var);

       See spank.h for more information, and EXAMPLES below for an example of
       spank_getenv usage.

       Many of the described SPANK functions available to plugins return
       errors via the spank_err_t error type. On success, the return value
       will be set to ESPANK_SUCCESS, while on failure, the return value will
       be set to one of many error values defined in slurm/spank.h. The SPANK
       interface provides a simple function

         const char * spank_strerror(spank_err_t err);

       which may be used to translate a spank_err_t value into its string
       representation.

       The slurm_spank_log function can be used to print messages back to the
       user at an error level. This is to keep users from having to rely on
       the slurm_error function, which can be confusing because it prepends
       "error:" to every message.

SPANK OPTIONS

       SPANK plugins also have an interface through which they may define and
       implement extra job options. These options are made available to the
       user through Slurm commands such as srun(1), salloc(1), and sbatch(1).
       If the option is specified by the user, its value is forwarded and
       registered with the plugin in slurmd when the job is run. In this way,
       SPANK plugins may dynamically provide new options and functionality to
       Slurm.

       Each option registered by a plugin to Slurm takes the form of a struct
       spank_option which is declared in <slurm/spank.h> as

          struct spank_option {
             char *         name;
             char *         arginfo;
             char *         usage;
             int            has_arg;
             int            val;
             spank_opt_cb_f cb;
          };

       Where

       name   is the name of the option. Its length is limited to
              SPANK_OPTION_MAXLEN defined in <slurm/spank.h>.

       arginfo
              is a description of the argument to the option, if the option
              does take an argument.

       usage  is a short description of the option suitable for --help output.

       has_arg
              0 if option takes no argument, 1 if option takes an argument,
              and 2 if the option takes an optional argument. (See
              getopt_long(3).)

       val    A plugin-local value to return to the option callback function.

       cb     A callback function that is invoked when the plugin option is
              registered with Slurm. spank_opt_cb_f is typedef'd in
              <slurm/spank.h> as

                typedef int (*spank_opt_cb_f) (int val, const char *optarg,
                                         int remote);

              Where val is the value of the val field in the spank_option
              struct, optarg is the supplied argument if applicable, and
              remote is 0 if the function is being called from the "local"
              host (e.g. the host where srun or sbatch/salloc are invoked) or
              1 from the "remote" host (the host where slurmd/slurmstepd
              run). On the remote host the callback is only executed by
              slurmstepd (remote context), and only if the option was
              registered for that context.

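       The has_arg values follow the convention of getopt_long(3), to which
       the field description refers. The following sketch does not use SPANK
       at all; it calls getopt_long directly to show what an optional
       argument (has_arg = 2) means on a command line. The option name and
       the scan_renice helper are hypothetical and chosen only for this
       example.

```c
#include <getopt.h>
#include <stddef.h>

/* Result of scanning a command line for a hypothetical --renice option. */
struct renice_arg {
    int seen;            /* 1 if --renice was given */
    const char *value;   /* argument text, or NULL if none was attached */
};

struct renice_arg scan_renice(int argc, char *argv[])
{
    /* optional_argument is 2: both --renice and --renice=5 are accepted. */
    static const struct option longopts[] = {
        { "renice", optional_argument, NULL, 'r' },
        { NULL, 0, NULL, 0 }
    };
    struct renice_arg result = { 0, NULL };
    int c;

    optind = 1;          /* restart scanning at argv[1] */
    opterr = 0;          /* keep getopt_long from printing diagnostics */
    while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
        if (c == 'r') {
            result.seen = 1;
            result.value = optarg;   /* NULL when no argument was attached */
        }
    }
    return result;
}
```

       With an optional argument the value must be attached with "=", as in
       --renice=5; a bare --renice leaves optarg NULL, which is why an option
       callback for such an option needs to handle a NULL optarg.
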
       Plugin options may be registered with Slurm using the
       spank_option_register function. This function is only valid when called
       from the plugin's slurm_spank_init handler, and registers one option at
       a time. The prototype is

          spank_err_t spank_option_register (spank_t sp,
                    struct spank_option *opt);

       This function will return ESPANK_SUCCESS on successful registration of
       an option, or ESPANK_BAD_ARG for errors including an invalid spank_t
       handle, or when the function is not called from the slurm_spank_init
       function. All options need to be registered from all contexts in which
       they will be used. For instance, if an option is only used in local
       (srun) and remote (slurmd) contexts, then spank_option_register should
       only be called from within those contexts. For example:

          if (spank_context() != S_CTX_ALLOCATOR)
             spank_option_register (sp, opt);

       If, however, the option is used in all contexts, then
       spank_option_register needs to be called everywhere.

       In addition to spank_option_register, plugins may also export options
       to Slurm by defining a table of struct spank_option with the symbol
       name spank_options. This method, however, is not supported for use with
       sbatch and salloc (allocator context), thus the use of
       spank_option_register is preferred. When using the spank_options table,
       the final element in the array must be filled with zeros. A
       SPANK_OPTIONS_TABLE_END macro is provided in <slurm/spank.h> for this
       purpose.

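       The zero-filled final element lets code walk the table without knowing
       its length. The sketch below shows that idea with a stand-in struct
       that mirrors the fields of struct spank_option, so that it compiles
       without <slurm/spank.h>. The names demo_option, DEMO_TABLE_END,
       find_option and the verbose-renice option are hypothetical, and how
       Slurm itself scans the table is not described here.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in mirroring the documented fields of struct spank_option. */
struct demo_option {
    char *name;
    char *arginfo;
    char *usage;
    int   has_arg;
    int   val;
    int (*cb)(int val, const char *optarg, int remote);
};

/* Plays the role of SPANK_OPTIONS_TABLE_END: an all-zero entry. */
#define DEMO_TABLE_END { NULL, NULL, NULL, 0, 0, NULL }

/* Walk the table until the all-zero terminator; return the match or NULL. */
const struct demo_option *find_option(const struct demo_option *table,
                                      const char *name)
{
    for (const struct demo_option *o = table; o->name != NULL; o++) {
        if (strcmp(o->name, name) == 0)
            return o;
    }
    return NULL;
}

/* Example table; the second option is invented for this illustration. */
static struct demo_option demo_options[] = {
    { "renice", "[prio]", "Re-nice job tasks to priority [prio].",
      2, 0, NULL },
    { "verbose-renice", NULL, "Log each re-nice operation.",
      0, 1, NULL },
    DEMO_TABLE_END
};
```

       Omitting the terminator would let such a loop run past the end of the
       array, which is the reason the final zero-filled element is mandatory.
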
       When an option is provided by the user on the local side, either by
       command line options or by environment variables, Slurm will
       immediately invoke the option's callback with remote=0. This is meant
       for the plugin to do local sanity checking of the option before the
       value is sent to the remote side during job launch. If the argument the
       user specified is invalid, the plugin should report an error and return
       a non-zero value from the callback. The plugin should be able to handle
       cases where the spank option is set multiple times through environment
       variables and command line options. Environment variables are processed
       before command line options.

       On the remote side, options and their arguments are registered just
       after SPANK plugins are loaded and before the slurm_spank_init handler
       is called. This allows plugins to modify behavior of all plugin
       functionality based on the value of user-provided options. (See
       EXAMPLES below for a plugin that registers an option with Slurm.)

       As an alternative to use of an option callback and global variable,
       plugins can use the spank_option_getopt function to check for supplied
       options after option processing. This function has the prototype:

          spank_err_t spank_option_getopt(spank_t sp,
              struct spank_option *opt, char **optargp);

       This function returns ESPANK_SUCCESS if the option defined in the
       struct spank_option opt has been used by the user. If optargp is
       non-NULL then it is set to any option argument passed (if the option
       takes an argument). The use of this method is required to process
       options in job_script context (slurm_spank_job_prolog and
       slurm_spank_job_epilog). This function is valid in the following
       callbacks: slurm_spank_job_prolog, slurm_spank_local_user_init,
       slurm_spank_user_init, slurm_spank_task_init_privileged,
       slurm_spank_task_init, slurm_spank_task_exit, and
       slurm_spank_job_epilog.

CONFIGURATION

       The default SPANK plug-in stack configuration file is plugstack.conf in
       the same directory as slurm.conf(5), though this may be changed via the
       Slurm config parameter PlugStackConfig. Normally the plugstack.conf
       file should be identical on all nodes of the cluster. The config file
       lists SPANK plugins, one per line, along with whether the plugin is
       required or optional, and any global arguments that are to be passed to
       the plugin for runtime configuration. Comments are preceded with '#'
       and extend to the end of the line. If the configuration file is missing
       or empty, it will simply be ignored.

       The format of each non-comment line in the configuration file is:

         required/optional   plugin   arguments

       For example:

         optional /usr/lib/slurm/test.so

       tells slurmd to load the plugin test.so, passing no arguments. If a
       SPANK plugin is required, then failure of any of the plugin's functions
       will cause slurmd to terminate the job, while optional plugins only
       cause a warning.

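       To make the line format concrete, the following sketch parses one line
       of such a file into its three fields. It is an illustration written
       for this manual, not Slurm's parser: the struct, the buffer sizes and
       the function names are assumptions, and lines using the include
       keyword are simply rejected as malformed.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One parsed entry; field names and sizes are choices made for this sketch. */
struct stack_entry {
    bool required;
    char plugin[256];
    char args[256];
};

static const char *skip_blanks(const char *s)
{
    return s + strspn(s, " \t");
}

/* Returns 1 for a plugin entry, 0 for a blank or comment-only line,
 * and -1 for a line that does not match the documented format. */
int parse_stack_line(const char *line, struct stack_entry *entry)
{
    char buf[512];
    const char *p;
    size_t n;

    snprintf(buf, sizeof(buf), "%s", line);
    buf[strcspn(buf, "#\r\n")] = '\0';      /* drop comment and newline */

    p = skip_blanks(buf);
    if (*p == '\0')
        return 0;

    /* First field: the required/optional flag. */
    n = strcspn(p, " \t");
    if (n == 8 && strncmp(p, "required", n) == 0)
        entry->required = true;
    else if (n == 8 && strncmp(p, "optional", n) == 0)
        entry->required = false;
    else
        return -1;

    /* Second field: the plugin path or file name. */
    p = skip_blanks(p + n);
    n = strcspn(p, " \t");
    if (n == 0 || n >= sizeof(entry->plugin))
        return -1;
    memcpy(entry->plugin, p, n);
    entry->plugin[n] = '\0';

    /* Everything after the plugin is the argument string. */
    p = skip_blanks(p + n);
    snprintf(entry->args, sizeof(entry->args), "%s", p);
    n = strlen(entry->args);
    while (n > 0 && (entry->args[n - 1] == ' ' || entry->args[n - 1] == '\t'))
        entry->args[--n] = '\0';

    return 1;
}
```

       The argument string is kept whole here; a plugin receives its
       arguments already split into the ac/argv pair described earlier.
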
       If a fully-qualified path is not specified for a plugin, then the
       currently configured PluginDir in slurm.conf(5) is searched.

       SPANK plugins are stackable, meaning that more than one plugin may be
       placed into the config file. The plugins will simply be called in
       order, one after the other, and appropriate action taken on failure
       given the state of the plugin's optional flag.

       Additional config files or directories of config files may be included
       in plugstack.conf with the include keyword. The include keyword must
       appear on its own line, and takes a glob as its parameter, so multiple
       files may be included from one include line. For example, the following
       syntax will load all config files in the /etc/slurm/plugstack.conf.d
       directory, in local collation order:

         include /etc/slurm/plugstack.conf.d/*

       which might be considered a more flexible method for building up a
       spank plugin stack.

       The SPANK config file is re-read on each job launch, so editing the
       config file will not affect running jobs. However, care should be taken
       so that a partially edited config file is not read by a launching job.

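       One common way to avoid exposing a partially written file is to write
       the new contents to a temporary file in the same directory and then
       rename(2) it over the old one, since rename replaces the target
       atomically on POSIX systems. This is a general technique, not
       something Slurm requires; the function name and the ".tmp" suffix in
       the sketch below are assumptions made for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Write contents to "<path>.tmp", flush it to disk, then rename(2) it over
 * path. Returns 0 on success, -1 on failure.
 */
int replace_file_atomically(const char *path, const char *contents)
{
    char tmp[4096];
    FILE *fp;
    int n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);

    if (n < 0 || (size_t) n >= sizeof(tmp))
        return -1;

    fp = fopen(tmp, "w");
    if (fp == NULL)
        return -1;

    /* Make sure the data has reached the disk before it becomes visible. */
    if (fputs(contents, fp) == EOF || fflush(fp) == EOF ||
        fsync(fileno(fp)) != 0) {
        fclose(fp);
        remove(tmp);
        return -1;
    }
    if (fclose(fp) == EOF) {
        remove(tmp);
        return -1;
    }

    /* Readers see either the old file or the complete new one. */
    if (rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

       The temporary file has to live on the same filesystem as the target,
       because rename(2) does not work across filesystems.
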

EXAMPLES

       Simple SPANK config file:

       #
       # SPANK config file
       #
       # required?       plugin                     args
       #
       optional          renice.so                  min_prio=-10
       required          /usr/lib/slurm/test.so

       The following is a simple SPANK plugin to modify the nice value of job
       tasks. This plugin adds a --renice=[prio] option to srun which users
       can use to set the priority of all remote tasks. Priority may also be
       specified via a SLURM_RENICE environment variable. A minimum priority
       may be established via a "min_prio" parameter in plugstack.conf (see
       above for an example).

       /*
        *   To compile:
        *    gcc -shared -fPIC -o renice.so renice.c
        *
        */
       #include <sys/types.h>
       #include <stdint.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <string.h>
       #include <sys/resource.h>

       #include <slurm/spank.h>

       /*
        * All spank plugins must define this macro for the
        * Slurm plugin loader.
        */
       SPANK_PLUGIN(renice, 1);

       #define PRIO_ENV_VAR "SLURM_RENICE"
       #define PRIO_NOT_SET 42

       /*
        * Minimum allowable value for priority. May be
        * set globally via plugin option min_prio=<prio>
        */
       static int min_prio = -20;

       static int prio = PRIO_NOT_SET;

       static int _renice_opt_process (int val,
                                       const char *optarg,
                                       int remote);
       static int _str2prio (const char *str, int *p2int);

       /*
        *  Provide a --renice=[prio] option to srun:
        */
       struct spank_option spank_options[] =
       {
           { "renice", "[prio]",
             "Re-nice job tasks to priority [prio].", 2, 0,
             (spank_opt_cb_f) _renice_opt_process
           },
           SPANK_OPTIONS_TABLE_END
       };

       /*
        *  Called from both srun and slurmd.
        */
       int slurm_spank_init (spank_t sp, int ac, char **av)
       {
           int i;

           /* Don't do anything in sbatch/salloc */
           if (spank_context () == S_CTX_ALLOCATOR)
               return (0);

           /* Process the arguments configured in plugstack.conf */
           for (i = 0; i < ac; i++) {
               if (strncmp ("min_prio=", av[i], 9) == 0) {
                   const char *optarg = av[i] + 9;
                   if (_str2prio (optarg, &min_prio) < 0)
                       slurm_error ("Ignoring invalid min_prio value: %s",
                                    av[i]);
               } else {
                   slurm_error ("renice: Invalid option: %s", av[i]);
               }
           }

           if (!spank_remote (sp))
               slurm_verbose ("renice: min_prio = %d", min_prio);

           return (0);
       }


       int slurm_spank_task_post_fork (spank_t sp, int ac, char **av)
       {
           pid_t pid;
           uint32_t taskid;

           if (prio == PRIO_NOT_SET) {
               /* See if SLURM_RENICE env var is set by user */
               char val [1024];

               if (spank_getenv (sp, PRIO_ENV_VAR, val, sizeof (val))
                   != ESPANK_SUCCESS)
                   return (0);

               if (_str2prio (val, &prio) < 0) {
                   slurm_error ("Bad value for %s: %s",
                                PRIO_ENV_VAR, val);
                   return (-1);
               }

               if (prio < min_prio) {
                   slurm_error ("%s=%d not allowed, using min=%d",
                                PRIO_ENV_VAR, prio, min_prio);
               }
           }

           /* Never go below the configured minimum */
           if (prio < min_prio)
               prio = min_prio;

           spank_get_item (sp, S_TASK_GLOBAL_ID, &taskid);
           spank_get_item (sp, S_TASK_PID, &pid);

           slurm_info ("re-nicing task%u pid %ld to %d",
                       taskid, (long) pid, prio);

           if (setpriority (PRIO_PROCESS, (int) pid,
                            (int) prio) < 0) {
               slurm_error ("setpriority: %m");
               return (-1);
           }

           return (0);
       }

       /*
        *  Convert str to a priority in the range [-20, 20].
        *  Returns 0 on success, -1 if str is empty, has trailing
        *  characters, or is out of range.
        */
       static int _str2prio (const char *str, int *p2int)
       {
           long int l;
           char *p;

           l = strtol (str, &p, 10);
           if ((p == str) || (*p != '\0') || (l < -20) || (l > 20))
               return (-1);

           *p2int = (int) l;

           return (0);
       }

       static int _renice_opt_process (int val,
                                       const char *optarg,
                                       int remote)
       {
           if (optarg == NULL) {
               slurm_error ("renice: invalid argument!");
               return (-1);
           }

           if (_str2prio (optarg, &prio) < 0) {
               slurm_error ("Bad value for --renice: %s",
                            optarg);
               return (-1);
           }

           if (prio < min_prio) {
               slurm_error ("--renice=%d not allowed, will use min=%d",
                            prio, min_prio);
           }

           return (0);
       }

COPYING

       Portions copyright (C) 2010-2018 SchedMD LLC. Copyright (C) 2006 The
       Regents of the University of California. Produced at Lawrence
       Livermore National Laboratory (cf, DISCLAIMER). CODE-OCEC-09-009. All
       rights reserved.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

FILES

       /etc/slurm/slurm.conf - Slurm configuration file.
       /etc/slurm/plugstack.conf - SPANK configuration file.
       /usr/include/slurm/spank.h - SPANK header file.

SEE ALSO

       srun(1), slurm.conf(5)

April 2020                      Slurm Component                       SPANK(8)