SPANK(8)                        Slurm Component                       SPANK(8)

NAME

       SPANK - Slurm Plug-in Architecture for Node and job (K)control

DESCRIPTION

       This manual briefly describes the capabilities of the Slurm Plug-in
       Architecture for Node and job Kontrol (SPANK) as well as the SPANK
       configuration file (by default, plugstack.conf).

       SPANK provides a very generic interface for stackable plug-ins which
       may be used to dynamically modify the job launch code in Slurm. SPANK
       plugins may be built without access to Slurm source code. They need
       only be compiled against Slurm's spank.h header file and added to the
       SPANK config file plugstack.conf; they will then be loaded at runtime
       during the next job launch. Thus, the SPANK infrastructure gives
       administrators and other developers a low-cost, low-effort way to
       dynamically modify the runtime behavior of Slurm job launch.

       Note: SPANK plugins using the Slurm APIs need to be recompiled when
       upgrading Slurm to a new major release.

SPANK PLUGINS

       SPANK plugins are loaded in up to five separate contexts during a Slurm
       job. Briefly, the five contexts are:

       local   In local context, the plugin is loaded by srun (i.e. the
               "local" part of a parallel job).

       remote  In remote context, the plugin is loaded by slurmstepd (i.e.
               the "remote" part of a parallel job).

       allocator
               In allocator context, the plugin is loaded in one of the job
               allocation utilities sbatch or salloc.

       slurmd  In slurmd context, the plugin is loaded in the slurmd daemon
               itself. Note: Plugins loaded in slurmd context persist for the
               entire time slurmd is running, so if configuration is changed
               or plugins are updated, slurmd must be restarted for the
               changes to take effect.

       job_script
               In the job_script context, plugins are loaded in the context
               of the job prolog or epilog. Note: Plugins are loaded in
               job_script context on each run of the job prolog or epilog, in
               a separate address space from plugins in slurmd context. This
               means there is no state shared between this context and other
               contexts, or even between one call to slurm_spank_job_prolog
               or slurm_spank_job_epilog and subsequent calls.

       In local context, only the init, exit, init_post_opt, and
       local_user_init functions are called. In allocator context, only the
       init, exit, and init_post_opt functions are called. Similarly, in
       slurmd context, only the slurmd_init and slurmd_exit callbacks are
       active, and in the job_script context, only the job_prolog and
       job_epilog callbacks are used. Plugins may query the context in which
       they are running with the spank_context and spank_remote functions
       defined in <slurm/spank.h>.

       SPANK plugins may be called from multiple points during the Slurm job
       launch. A plugin may define the following functions:
       
       slurm_spank_init
         Called just after plugins are loaded. In remote context, this is just
         after the job step is initialized. This function is called before any
         plugin option processing. This function is not called in slurmd
         context.

       slurm_spank_slurmd_init
         Called in slurmd just after the daemon is started.

       slurm_spank_job_prolog
         Called at the same time as the job prolog. If this function returns a
         negative value and the SPANK plugin that contains it is required in
         the plugstack.conf, the node that this is run on will be drained.

       slurm_spank_init_post_opt
         Called at the same point as slurm_spank_init, but after all user
         options to the plugin have been processed. The init and init_post_opt
         callbacks are separated so that plugins can process system-wide
         options specified in plugstack.conf in the init callback, then
         process user options, and finally take some action in
         slurm_spank_init_post_opt if necessary. In the case of a
         heterogeneous job, slurm_spank_init is invoked once per job
         component.

       slurm_spank_local_user_init
         Called in local (srun) context only, after all options have been
         processed. This is called after the job ID and step IDs are
         available. This happens in srun after the allocation is made, but
         before tasks are launched.

       slurm_spank_user_init
         Called after privileges are temporarily dropped. (remote context
         only)

       slurm_spank_task_init_privileged
         Called for each task just after fork, but before all elevated
         privileges are dropped. (remote context only)

       slurm_spank_task_init
         Called for each task just before execve(2). (remote context only)

       slurm_spank_task_post_fork
         Called for each task from the parent process after fork(2) is
         complete. Because slurmd does not exec any tasks until all tasks
         have completed fork(2), this call is guaranteed to run before the
         user task is executed. (remote context only)

       slurm_spank_task_exit
         Called for each task as its exit status is collected by Slurm.
         (remote context only)

       slurm_spank_exit
         Called once just before slurmstepd exits in remote context. In local
         context, called before srun exits.

       slurm_spank_job_epilog
         Called at the same time as the job epilog. If this function returns a
         negative value and the SPANK plugin that contains it is required in
         the plugstack.conf, the node that this is run on will be drained.

       slurm_spank_slurmd_exit
         Called in slurmd when the daemon is shut down.

       All of these functions have the same prototype, for example:

          int slurm_spank_init (spank_t spank, int ac, char *argv[])

       Where spank is the SPANK handle which must be passed back to Slurm when
       the plugin calls functions like spank_get_item and spank_getenv.
       Configured arguments (see CONFIGURATION below) are passed in the
       argument vector argv with argument count ac.
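       
       As an illustration of how a callback might consume ac and argv, the
       following sketch looks up a "key=value" argument by key. The helper
       name plugin_arg_value is hypothetical and is not part of spank.h; it
       uses only the C standard library and assumes arguments are written in
       key=value form, as in the min_prio example later in this manual.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of spank.h): return the value of the first
 * "key=value" argument in argv whose key matches exactly, or NULL if there
 * is none. ac and argv are the values passed to a slurm_spank_* callback.
 */
const char *plugin_arg_value(int ac, char *argv[], const char *key)
{
    size_t klen = strlen(key);
    int i;

    for (i = 0; i < ac; i++) {
        /* Require the '=' right after the key so "min" != "min_prio". */
        if (strncmp(argv[i], key, klen) == 0 && argv[i][klen] == '=')
            return argv[i] + klen + 1;
    }
    return NULL;
}
```

       A callback could call plugin_arg_value(ac, argv, "min_prio") and fall
       back to a built-in default when the result is NULL.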
       
       SPANK plugins can query the current list of supported slurm_spank
       symbols to determine if the current version supports a given plugin
       hook. This may be useful because the list of plugin symbols may grow
       in the future. The query is done using the spank_symbol_supported
       function, which has the following prototype:

           int spank_symbol_supported (const char *sym);

       The return value is 1 if the symbol is supported, 0 if not.

       SPANK plugins do not have direct access to internally defined Slurm
       data structures. Instead, information about the currently executing job
       is obtained via the spank_get_item function call.

         spank_err_t spank_get_item (spank_t spank, spank_item_t item, ...);

       The spank_get_item call must be passed the current SPANK handle as well
       as the item requested, which is defined by the passed spank_item_t. A
       variable number of pointer arguments are also passed, depending on
       which item was requested by the plugin. A list of the valid values for
       item is kept in the spank.h header file. Some examples are:

       S_JOB_UID
         User id for running job. (uid_t *) is third arg of spank_get_item.

       S_JOB_STEPID
         Job step id for running job. (uint32_t *) is third arg of
         spank_get_item.

       S_TASK_EXIT_STATUS
         Exit status for exited task. Only valid from slurm_spank_task_exit.
         (int *) is third arg of spank_get_item.

       S_JOB_ARGV
         Complete job command line. Third and fourth args to spank_get_item
         are (int *, char ***).

       See spank.h for more details, and EXAMPLES below for an example of
       spank_get_item usage.

       SPANK functions in the local and allocator environment should use the
       getenv, setenv, and unsetenv functions to view and modify the job's
       environment. SPANK functions in the remote environment should use the
       spank_getenv, spank_setenv, and spank_unsetenv functions to view and
       modify the job's environment. spank_getenv searches the job's
       environment for the environment variable var and copies the current
       value into a buffer buf of length len. spank_setenv allows a SPANK
       plugin to set or overwrite a variable in the job's environment, and
       spank_unsetenv unsets an environment variable in the job's environment.
       The prototypes are:

        spank_err_t spank_getenv (spank_t spank, const char *var,
                            char *buf, int len);
        spank_err_t spank_setenv (spank_t spank, const char *var,
                            const char *val, int overwrite);
        spank_err_t spank_unsetenv (spank_t spank, const char *var);

       These are only necessary in remote context; in local context the
       standard process environment can be viewed and modified directly with
       getenv(3), setenv(3), and unsetenv(3).
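       
       For code that runs in local or allocator context, a plugin might wrap
       getenv(3) so that it has the same (var, buf, len) shape as
       spank_getenv. The sketch below is one way to do that; local_env_copy
       is a hypothetical name, not a Slurm function, and its 0/-1 return
       convention is an assumption made for this example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper for local or allocator context: copy the value of
 * var from the process environment into buf, which holds len bytes.
 * Returns 0 on success, -1 if var is unset or the value (including its
 * terminating NUL) does not fit in buf.
 */
int local_env_copy(const char *var, char *buf, int len)
{
    const char *val = getenv(var);
    size_t n;

    if (val == NULL || buf == NULL || len <= 0)
        return -1;

    n = strlen(val);
    if (n >= (size_t) len)
        return -1;

    memcpy(buf, val, n + 1);    /* n + 1 copies the terminating NUL */
    return 0;
}
```

       Refusing to truncate keeps a too-long value from being silently
       misread as a shorter, valid one.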
       
       Functions are also available from within the SPANK plugins to establish
       environment variables to be exported to the Slurm PrologSlurmctld,
       Prolog, Epilog and EpilogSlurmctld programs (the so-called job control
       environment). The names of environment variables established by these
       calls will be prepended with the string SPANK_ in order to avoid any
       security implications of arbitrary environment variable control. (After
       all, the job control scripts do run as root or the Slurm user.)

       These functions are available from local context only.

         spank_err_t spank_job_control_getenv(spank_t spank, const char *var,
                              char *buf, int len);
         spank_err_t spank_job_control_setenv(spank_t spank, const char *var,
                              const char *val, int overwrite);
         spank_err_t spank_job_control_unsetenv(spank_t spank, const char *var);

       See spank.h for more information, and EXAMPLES below for an example of
       spank_getenv usage.
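       
       On the receiving side, a Prolog or Epilog program sees such a variable
       under its prefixed name. The following sketch shows how a program
       written in C might look one up; job_control_lookup is a hypothetical
       helper, and the 256-byte name limit is an arbitrary choice for this
       example.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper for a Prolog/Epilog program: look up a variable
 * that a plugin exported to the job control environment. The name seen
 * by the program is the plugin's name with "SPANK_" prepended.
 * Returns the value, or NULL if it is unset or the name is too long.
 */
const char *job_control_lookup(const char *var)
{
    char name[256];
    int n = snprintf(name, sizeof(name), "SPANK_%s", var);

    if (n < 0 || (size_t) n >= sizeof(name))
        return NULL;
    return getenv(name);
}
```

       For a variable that a plugin set under the (assumed) name RENICE_PRIO,
       the program would call job_control_lookup("RENICE_PRIO"), which reads
       SPANK_RENICE_PRIO.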
       
       Many of the described SPANK functions available to plugins return
       errors via the spank_err_t error type. On success, the return value
       will be set to ESPANK_SUCCESS, while on failure, the return value will
       be set to one of many error values defined in slurm/spank.h. The SPANK
       interface provides a simple function

         const char * spank_strerror(spank_err_t err);

       which may be used to translate a spank_err_t value into its string
       representation.

SPANK OPTIONS

       SPANK plugins also have an interface through which they may define and
       implement extra job options. These options are made available to the
       user through Slurm commands such as srun(1), salloc(1), and sbatch(1).
       If the option is specified by the user, its value is forwarded and
       registered with the plugin in slurmd when the job is run. In this way,
       SPANK plugins may dynamically provide new options and functionality to
       Slurm.

       Each option registered by a plugin to Slurm takes the form of a struct
       spank_option which is declared in <slurm/spank.h> as

          struct spank_option {
             char *         name;
             char *         arginfo;
             char *         usage;
             int            has_arg;
             int            val;
             spank_opt_cb_f cb;
          };

       Where

       name   is the name of the option. Its length is limited to
              SPANK_OPTION_MAXLEN defined in <slurm/spank.h>.

       arginfo
              is a description of the argument to the option, if the option
              does take an argument.

       usage  is a short description of the option suitable for --help output.

       has_arg
              0 if option takes no argument, 1 if option takes an argument,
              and 2 if the option takes an optional argument. (See
              getopt_long(3)).

       val    A plugin-local value to return to the option callback function.

       cb     A callback function that is invoked when the plugin option is
              registered with Slurm. spank_opt_cb_f is typedef'd in
              <slurm/spank.h> as

                typedef int (*spank_opt_cb_f) (int val, const char *optarg,
                                         int remote);

              Where val is the value of the val field in the spank_option
              struct, optarg is the supplied argument if applicable, and
              remote is 0 if the function is being called from the "local"
              host (e.g. host where srun or sbatch/salloc are invoked) or 1
              from the "remote" host (host where slurmd/slurmstepd run) but
              only executed by slurmstepd (remote context) if the option was
              registered for such context.
       
       Plugin options may be registered with Slurm using the
       spank_option_register function. This function is only valid when
       called from the plugin's slurm_spank_init handler, and registers one
       option at a time. The prototype is

          spank_err_t spank_option_register (spank_t sp,
                    struct spank_option *opt);

       This function will return ESPANK_SUCCESS on successful registration of
       an option, or ESPANK_BAD_ARG for errors including an invalid spank_t
       handle, or when the function is not called from the slurm_spank_init
       function. All options need to be registered from all contexts in which
       they will be used. For instance, if an option is only used in local
       (srun) and remote (slurmd) contexts, then spank_option_register should
       only be called from within those contexts. For example:

          if (spank_context() != S_CTX_ALLOCATOR)
             spank_option_register (sp, opt);

       If, however, the option is used in all contexts, then
       spank_option_register needs to be called everywhere.

       In addition to spank_option_register, plugins may also export options
       to Slurm by defining a table of struct spank_option with the symbol
       name spank_options. This method, however, is not supported for use with
       sbatch and salloc (allocator context), thus the use of
       spank_option_register is preferred. When using the spank_options table,
       the final element in the array must be filled with zeros. A
       SPANK_OPTIONS_TABLE_END macro is provided in <slurm/spank.h> for this
       purpose.

       When an option is provided by the user on the local side, either by
       command line options or by environment variables, Slurm will
       immediately invoke the option's callback with remote=0. This is meant
       for the plugin to do local sanity checking of the option before the
       value is sent to the remote side during job launch. If the argument the
       user specified is invalid, the plugin should report an error and return
       a non-zero code from the callback. The plugin should be able to handle
       cases where the SPANK option is set multiple times through environment
       variables and command line options. Environment variables are
       processed before command line options.
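       
       The validation behavior described above can be sketched as a callback
       with the spank_opt_cb_f signature. The option name --level, the
       accepted range 0 to 9, and the names demo_level and demo_level_cb are
       all assumptions made for this example. To stay self-contained it
       reports errors with fprintf(3); a real plugin would more likely use
       slurm_error.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Last accepted value; a later invocation overwrites an earlier one. */
long demo_level = 0;

/*
 * Hypothetical option callback with the spank_opt_cb_f signature.
 * Returns 0 if optarg is a decimal integer in [0, 9], non-zero otherwise.
 * On failure the previously accepted value is left untouched.
 */
int demo_level_cb(int val, const char *optarg, int remote)
{
    char *end;
    long l;

    (void) val;       /* unused: this plugin has a single option */
    (void) remote;    /* same validation on local and remote side */

    if (optarg == NULL || *optarg == '\0') {
        fprintf(stderr, "demo: --level requires an argument\n");
        return -1;
    }

    errno = 0;
    l = strtol(optarg, &end, 10);
    if (errno != 0 || *end != '\0' || l < 0 || l > 9) {
        fprintf(stderr, "demo: bad value for --level: %s\n", optarg);
        return -1;
    }

    demo_level = l;
    return 0;
}
```

       Because each valid call simply overwrites demo_level, a value given on
       the command line replaces one taken from the environment, which
       matches the processing order stated above.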
       
       On the remote side, options and their arguments are registered just
       after SPANK plugins are loaded and before the slurm_spank_init handler
       is called. This allows plugins to modify behavior of all plugin
       functionality based on the value of user-provided options. (See
       EXAMPLES below for a plugin that registers an option with Slurm.)

       As an alternative to use of an option callback and global variable,
       plugins can use the spank_option_getopt function to check for supplied
       options after option processing. This function has the prototype:

          spank_err_t spank_option_getopt(spank_t sp,
              struct spank_option *opt, char **optargp);

       This function returns ESPANK_SUCCESS if the option defined in the
       struct spank_option opt has been used by the user. If optargp is
       non-NULL then it is set to any option argument passed (if the option
       takes an argument). The use of this method is required to process
       options in job_script context (slurm_spank_job_prolog and
       slurm_spank_job_epilog). This function is valid in the following
       callbacks: slurm_spank_job_prolog, slurm_spank_local_user_init,
       slurm_spank_user_init, slurm_spank_task_init_privileged,
       slurm_spank_task_init, slurm_spank_task_exit, and
       slurm_spank_job_epilog.

CONFIGURATION

       The default SPANK plug-in stack configuration file is plugstack.conf in
       the same directory as slurm.conf(5), though this may be changed via the
       Slurm config parameter PlugStackConfig. Normally the plugstack.conf
       file should be identical on all nodes of the cluster. The config file
       lists SPANK plugins, one per line, along with whether the plugin is
       required or optional, and any global arguments that are to be passed to
       the plugin for runtime configuration. Comments are preceded with '#'
       and extend to the end of the line. If the configuration file is
       missing or empty, it will simply be ignored.

       The format of each non-comment line in the configuration file is:

         required/optional   plugin   arguments

       For example:

         optional /usr/lib/slurm/test.so

       tells slurmd to load the plugin test.so, passing no arguments. If a
       SPANK plugin is required, then failure of any of the plugin's functions
       will cause slurmd to terminate the job, while optional plugins only
       cause a warning.
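       
       To make the line format concrete, the following sketch splits one
       configuration line into its three parts. It is not Slurm's parser:
       the struct stack_entry type, the parse_stack_line function, and the
       16-argument limit are assumptions for illustration, and neither
       quoting nor the include keyword is handled.

```c
#include <string.h>

/* Hypothetical in-memory form of one plugstack.conf entry. */
struct stack_entry {
    int required;      /* 1 for "required", 0 for "optional" */
    char *plugin;      /* plugin path or file name */
    char *args[16];    /* remaining whitespace-separated arguments */
    int ac;            /* number of entries used in args */
};

/*
 * Parse one configuration line in place (the line buffer is modified and
 * the entry points into it). Returns 1 if an entry was parsed, 0 for a
 * blank or comment-only line, and -1 for a malformed line.
 */
int parse_stack_line(char *line, struct stack_entry *e)
{
    const char *ws = " \t\r\n";
    char *hash = strchr(line, '#');
    char *tok;

    if (hash != NULL)
        *hash = '\0';               /* comments run to end of line */

    tok = strtok(line, ws);
    if (tok == NULL)
        return 0;
    if (strcmp(tok, "required") == 0)
        e->required = 1;
    else if (strcmp(tok, "optional") == 0)
        e->required = 0;
    else
        return -1;

    e->plugin = strtok(NULL, ws);
    if (e->plugin == NULL)
        return -1;

    e->ac = 0;
    while (e->ac < 16 && (tok = strtok(NULL, ws)) != NULL)
        e->args[e->ac++] = tok;
    return 1;
}
```

       The args and ac fields correspond to the argv and ac values that each
       plugin callback receives.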
       
       If a fully-qualified path is not specified for a plugin, then the
       currently configured PluginDir in slurm.conf(5) is searched.

       SPANK plugins are stackable, meaning that more than one plugin may be
       placed into the config file. The plugins will simply be called in
       order, one after the other, and appropriate action taken on failure
       given the state of the plugin's optional flag.

       Additional config files or directories of config files may be included
       in plugstack.conf with the include keyword. The include keyword must
       appear on its own line, and takes a glob as its parameter, so multiple
       files may be included from one include line. For example, the following
       syntax will load all config files in the /etc/slurm/plugstack.conf.d
       directory, in local collation order:

         include /etc/slurm/plugstack.conf.d/*

       which might be considered a more flexible method for building up a
       SPANK plugin stack.

       The SPANK config file is re-read on each job launch, so editing the
       config file will not affect running jobs. However, care should be taken
       so that a partially edited config file is not read by a launching job.

EXAMPLES

       Simple SPANK config file:

       #
       # SPANK config file
       #
       # required?       plugin                     args
       #
       optional          renice.so                  min_prio=-10
       required          /usr/lib/slurm/test.so


       The following is a simple SPANK plugin to modify the nice value of job
       tasks. This plugin adds a --renice=[prio] option to srun which users
       can use to set the priority of all remote tasks. Priority may also be
       specified via a SLURM_RENICE environment variable. A minimum priority
       may be established via a "min_prio" parameter in plugstack.conf (see
       above for example).
       
       /*
        *   To compile:
        *    gcc -shared -fPIC -o renice.so renice.c
        *
        */
       #include <sys/types.h>
       #include <stdint.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <string.h>
       #include <sys/resource.h>

       #include <slurm/spank.h>

       /*
        * All spank plugins must define this macro for the
        * Slurm plugin loader.
        */
       SPANK_PLUGIN(renice, 1);

       #define PRIO_ENV_VAR "SLURM_RENICE"
       #define PRIO_NOT_SET 42

       /*
        * Minimum allowable value for priority. May be
        * set globally via plugin option min_prio=<prio>
        */
       static int min_prio = -20;

       static int prio = PRIO_NOT_SET;

       static int _renice_opt_process (int val,
                                       const char *optarg,
                                       int remote);
       static int _str2prio (const char *str, int *p2int);

       /*
        *  Provide a --renice=[prio] option to srun:
        */
       struct spank_option spank_options[] =
       {
           { "renice", "[prio]",
             "Re-nice job tasks to priority [prio].", 2, 0,
             (spank_opt_cb_f) _renice_opt_process
           },
           SPANK_OPTIONS_TABLE_END
       };

       /*
        *  Called from both srun and slurmd.
        */
       int slurm_spank_init (spank_t sp, int ac, char **av)
       {
           int i;

           /* Don't do anything in sbatch/salloc */
           if (spank_context () == S_CTX_ALLOCATOR)
               return (0);

           /* Parse the arguments given in plugstack.conf */
           for (i = 0; i < ac; i++) {
               if (strncmp ("min_prio=", av[i], 9) == 0) {
                   const char *optarg = av[i] + 9;
                   if (_str2prio (optarg, &min_prio) < 0)
                       slurm_error ("Ignoring invalid min_prio value: %s",
                                    av[i]);
               } else {
                   slurm_error ("renice: Invalid option: %s", av[i]);
               }
           }

           if (!spank_remote (sp))
               slurm_verbose ("renice: min_prio = %d", min_prio);

           return (0);
       }


       int slurm_spank_task_post_fork (spank_t sp, int ac, char **av)
       {
           pid_t pid;
           uint32_t taskid;

           if (prio == PRIO_NOT_SET) {
               /* See if SLURM_RENICE env var is set by user */
               char val [1024];

               if (spank_getenv (sp, PRIO_ENV_VAR, val, sizeof (val))
                   != ESPANK_SUCCESS)
                   return (0);

               if (_str2prio (val, &prio) < 0) {
                   slurm_error ("Bad value for %s: %s",
                                PRIO_ENV_VAR, val);
                   return (-1);
               }

               if (prio < min_prio) {
                   slurm_error ("%s=%d not allowed, using min=%d",
                                PRIO_ENV_VAR, prio, min_prio);
               }
           }

           /* Clamp to the configured minimum */
           if (prio < min_prio)
               prio = min_prio;

           spank_get_item (sp, S_TASK_GLOBAL_ID, &taskid);
           spank_get_item (sp, S_TASK_PID, &pid);

           slurm_info ("re-nicing task%u pid %ld to %d",
                       (unsigned int) taskid, (long) pid, prio);

           if (setpriority (PRIO_PROCESS, (id_t) pid, prio) < 0) {
               slurm_error ("setpriority: %m");
               return (-1);
           }

           return (0);
       }

       /*
        *  Convert str to a priority in [-20, 20].
        *  Returns 0 on success, -1 if str is not a valid priority.
        */
       static int _str2prio (const char *str, int *p2int)
       {
           long int l;
           char *p;

           l = strtol (str, &p, 10);
           /* Reject empty input, trailing junk, and out-of-range values */
           if ((p == str) || (*p != '\0') || (l < -20) || (l > 20))
               return (-1);

           *p2int = (int) l;

           return (0);
       }

       static int _renice_opt_process (int val,
                                       const char *optarg,
                                       int remote)
       {
           if (optarg == NULL) {
               slurm_error ("renice: invalid argument!");
               return (-1);
           }

           if (_str2prio (optarg, &prio) < 0) {
               slurm_error ("Bad value for --renice: %s",
                            optarg);
               return (-1);
           }

           if (prio < min_prio) {
               slurm_error ("--renice=%d not allowed, will use min=%d",
                            prio, min_prio);
           }

           return (0);
       }

COPYING

       Portions copyright (C) 2010-2018 SchedMD LLC. Copyright (C) 2006 The
       Regents of the University of California. Produced at Lawrence
       Livermore National Laboratory (cf, DISCLAIMER). CODE-OCEC-09-009. All
       rights reserved.

       This file is part of Slurm, a resource management program. For
       details, see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

FILES

       /etc/slurm/slurm.conf - Slurm configuration file.
       /etc/slurm/plugstack.conf - SPANK configuration file.
       /usr/include/slurm/spank.h - SPANK header file.

SEE ALSO

       srun(1), slurm.conf(5)



August 2017                     Slurm Component                       SPANK(8)