SPANK(8)                        Slurm Component                       SPANK(8)

NAME

       SPANK - Slurm Plug-in Architecture for Node and job (K)control

DESCRIPTION

       This manual briefly describes the capabilities of the Slurm Plug-in
       Architecture for Node and job Kontrol (SPANK) as well as the SPANK
       configuration file (by default, plugstack.conf).

       SPANK provides a very generic interface for stackable plug-ins which
       may be used to dynamically modify the job launch code in Slurm. SPANK
       plugins may be built without access to Slurm source code. They need
       only be compiled against Slurm's spank.h header file and added to the
       SPANK config file plugstack.conf; they will then be loaded at runtime
       during the next job launch. Thus, the SPANK infrastructure gives
       administrators and other developers a low-cost, low-effort way to
       dynamically modify the runtime behavior of Slurm job launch.

       NOTE: All SPANK plugins should be recompiled when upgrading Slurm to a
       new major release. The SPANK API is not guaranteed to be ABI compatible
       between major releases. Any SPANK plugin linking to any of the Slurm
       libraries should be carefully checked, as the Slurm APIs and headers
       can change between major releases.

SPANK PLUGINS

       SPANK plugins are loaded in up to five separate contexts during a Slurm
       job. Briefly, the five contexts are:

       local   In local context, the plugin is loaded by srun (i.e. the
               "local" part of a parallel job).

       remote  In remote context, the plugin is loaded by slurmstepd (i.e.
               the "remote" part of a parallel job).

       allocator
               In allocator context, the plugin is loaded in one of the job
               allocation utilities salloc, sbatch or scrontab.

       slurmd  In slurmd context, the plugin is loaded in the slurmd daemon
               itself. Note: Plugins loaded in slurmd context persist for the
               entire time slurmd is running, so if configuration is changed
               or plugins are updated, slurmd must be restarted for the
               changes to take effect.

       job_script
               In the job_script context, plugins are loaded in the context of
               the job prolog or epilog. Note: Plugins are loaded in
               job_script context on each run of the job prolog or epilog, in
               a separate address space from plugins in slurmd context. This
               means there is no state shared between this context and other
               contexts, or even between one call to slurm_spank_job_prolog or
               slurm_spank_job_epilog and subsequent calls.

       In local context, only the init, exit, init_post_opt, and
       local_user_init functions are called. In allocator context, only the
       init, exit, and init_post_opt functions are called. Similarly, in
       slurmd context, only the init and slurmd_exit callbacks are active, and
       in the job_script context, only the job_prolog and job_epilog callbacks
       are used. Plugins may query the context in which they are running with
       the spank_context and spank_remote functions defined in
       <slurm/spank.h>.

       SPANK plugins may be called from multiple points during the Slurm job
       launch. A plugin may define the following functions:

       slurm_spank_init
         Called just after plugins are loaded. In remote context, this is just
         after the job step is initialized. This function is called before any
         plugin option processing.

       slurm_spank_job_prolog
         Called at the same time as the job prolog. If this function returns a
         non-zero value and the SPANK plugin that contains it is required in
         plugstack.conf, the node that this is run on will be drained.

       slurm_spank_init_post_opt
         Called at the same point as slurm_spank_init, but after all user
         options to the plugin have been processed. The init and init_post_opt
         callbacks are separated so that plugins can process system-wide
         options specified in plugstack.conf in the init callback, then
         process user options, and finally take some action in
         slurm_spank_init_post_opt if necessary. In the case of a
         heterogeneous job, slurm_spank_init is invoked once per job
         component.

       slurm_spank_local_user_init
         Called in local (srun) context only, after all options have been
         processed. This is called after the job ID and step IDs are
         available. This happens in srun after the allocation is made, but
         before tasks are launched.

       slurm_spank_user_init
         Called after privileges are temporarily dropped. (remote context
         only)

       slurm_spank_task_init_privileged
         Called for each task just after fork, but before all elevated
         privileges are dropped. (remote context only)

       slurm_spank_task_init
         Called for each task just before execve(2). If you are restricting
         memory with cgroups, memory allocated here will be in the job's
         cgroup. (remote context only)

       slurm_spank_task_post_fork
         Called for each task from the parent process after fork(2) is
         complete. Because slurmd does not exec any tasks until all tasks
         have completed fork(2), this call is guaranteed to run before the
         user task is executed. (remote context only)

       slurm_spank_task_exit
         Called for each task as its exit status is collected by Slurm.
         (remote context only)

       slurm_spank_exit
         Called once just before slurmstepd exits in remote context. In local
         context, called before srun exits.

       slurm_spank_job_epilog
         Called at the same time as the job epilog. If this function returns a
         non-zero value and the SPANK plugin that contains it is required in
         plugstack.conf, the node that this is run on will be drained.

       slurm_spank_slurmd_exit
         Called in slurmd when the daemon is shut down.

       All of these functions have the same prototype, for example:
          int slurm_spank_init (spank_t spank, int ac, char *argv[])

       Where spank is the SPANK handle which must be passed back to Slurm when
       the plugin calls functions like spank_get_item and spank_getenv.
       Configured arguments (see CONFIGURATION below) are passed in the
       argument vector argv with argument count ac.
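
       As an illustration of how a callback might consume the ac/argv
       vector, the following is a minimal sketch of a helper that scans the
       configured arguments for a "key=<integer>" entry, in the spirit of
       the min_prio= parameter used by the renice example. The helper name
       find_int_arg and its return convention are assumptions made for this
       sketch; they are not part of the SPANK API.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of SPANK): scan the configured arguments
 * for "key=<integer>". Returns 1 and stores the value in *out if the key
 * is found with a valid base-10 integer, 0 if the key is absent, and -1
 * if the key is present but its value is not a valid integer.
 */
static int find_int_arg(int ac, char **av, const char *key, long *out)
{
    size_t klen = strlen(key);

    for (int i = 0; i < ac; i++) {
        /* Require "key" immediately followed by '='. */
        if (strncmp(av[i], key, klen) != 0 || av[i][klen] != '=')
            continue;

        const char *val = av[i] + klen + 1;
        char *end = NULL;
        long n = strtol(val, &end, 10);

        /* Reject an empty value or trailing non-digit characters. */
        if (*val == '\0' || *end != '\0')
            return -1;

        *out = n;
        return 1;
    }
    return 0;
}
```

       A plugin could call such a helper from slurm_spank_init with the ac
       and argv it receives, and fall back to a built-in default when the
       helper returns 0.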

       SPANK plugins can query the current list of supported slurm_spank
       symbols to determine if the current version supports a given plugin
       hook. This may be useful because the list of plugin symbols may grow in
       the future. The query is done using the spank_symbol_supported
       function, which has the following prototype:
           int spank_symbol_supported (const char *sym);

       The return value is 1 if the symbol is supported, 0 if not.

       SPANK plugins do not have direct access to internally defined Slurm
       data structures. Instead, information about the currently executing job
       is obtained via the spank_get_item function call.
         spank_err_t spank_get_item (spank_t spank, spank_item_t item, ...);

       The spank_get_item call must be passed the current SPANK handle as well
       as the item requested, which is defined by the passed spank_item_t. A
       variable number of pointer arguments are also passed, depending on
       which item was requested by the plugin. A list of the valid values for
       item is kept in the spank.h header file. Some examples are:

       S_JOB_UID
         User id for running job. (uid_t *) is third arg of spank_get_item.

       S_JOB_STEPID
         Job step id for running job. (uint32_t *) is third arg of
         spank_get_item.

       S_TASK_EXIT_STATUS
         Exit status for exited task. Only valid from slurm_spank_task_exit.
         (int *) is third arg of spank_get_item.

       S_JOB_ARGV
         Complete job command line. Third and fourth args to spank_get_item
         are (int *, char ***).

       See spank.h for more details, and EXAMPLES below for an example of
       spank_get_item usage.

       SPANK functions in the local and allocator contexts should use the
       getenv, setenv, and unsetenv functions to view and modify the job's
       environment. SPANK functions in the remote context should use the
       spank_getenv, spank_setenv, and spank_unsetenv functions to view and
       modify the job's environment. spank_getenv searches the job's
       environment for the environment variable var and copies the current
       value into a buffer buf of length len. spank_setenv allows a SPANK
       plugin to set or overwrite a variable in the job's environment, and
       spank_unsetenv unsets an environment variable in the job's environment.
       The prototypes are:
        spank_err_t spank_getenv (spank_t spank, const char *var,
                            char *buf, int len);
        spank_err_t spank_setenv (spank_t spank, const char *var,
                            const char *val, int overwrite);
        spank_err_t spank_unsetenv (spank_t spank, const char *var);

       These functions are only necessary in remote context; in local context
       the standard process environment can be viewed and modified directly
       with getenv(3), setenv(3), and unsetenv(3).
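
       For the local and allocator case, where the standard C library calls
       are used, a minimal sketch might look like the following. The helper
       names export_default and is_exported, and the variable names used
       with them, are hypothetical and exist only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose setenv(3) and unsetenv(3) */
#endif
#include <stdlib.h>

/*
 * Hypothetical helper for local or allocator context: export a default
 * value into the job's environment without clobbering a value the user
 * already set. Returns 0 on success and -1 on error, as setenv(3) does.
 */
static int export_default(const char *name, const char *value)
{
    /* overwrite == 0 keeps an existing value untouched. */
    return setenv(name, value, 0);
}

/* Hypothetical helper: report whether the environment defines name. */
static int is_exported(const char *name)
{
    return getenv(name) != NULL;
}
```

       Passing 0 as the last argument of setenv(3) leaves a value chosen by
       the user in place; passing a non-zero value would replace it.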

       Functions are also available from within the SPANK plugins to establish
       environment variables to be exported to the Slurm PrologSlurmctld,
       Prolog, Epilog and EpilogSlurmctld programs (the so-called job control
       environment). The names of environment variables established by these
       calls are prepended with the string SPANK_ in order to avoid any
       security implications of arbitrary environment variable control. (After
       all, the job control scripts do run as root or the Slurm user.)

       These functions are available from local context only.
         spank_err_t spank_job_control_getenv(spank_t spank, const char *var,
                              char *buf, int len);
         spank_err_t spank_job_control_setenv(spank_t spank, const char *var,
                              const char *val, int overwrite);
         spank_err_t spank_job_control_unsetenv(spank_t spank, const char *var);
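
       To make the effect of the SPANK_ prefix concrete, the sketch below
       shows how a Prolog or Epilog helper written in C might look up a
       variable that a plugin established with spank_job_control_setenv.
       The function job_control_lookup, the 256-byte name buffer, and the
       variable name SCRATCH_DIR are assumptions for this sketch only.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper for a Prolog/Epilog program: the plugin sets a
 * variable under its bare name (for example "SCRATCH_DIR"), while the
 * job control script sees it with the SPANK_ prefix prepended.
 * Returns the value, or NULL if it is unset or the name is too long.
 */
static const char *job_control_lookup(const char *var)
{
    char name[256];
    int n = snprintf(name, sizeof(name), "SPANK_%s", var);

    /* snprintf reports truncation by returning >= the buffer size. */
    if (n < 0 || (size_t)n >= sizeof(name))
        return NULL;

    return getenv(name);
}
```

       With this helper, a value the plugin stored as SCRATCH_DIR would be
       retrieved in the job control environment through the name
       SPANK_SCRATCH_DIR.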

       See spank.h for more information, and EXAMPLES below for an example of
       spank_getenv usage.

       Many of the described SPANK functions available to plugins return
       errors via the spank_err_t error type. On success, the return value
       will be set to ESPANK_SUCCESS, while on failure, the return value will
       be set to one of many error values defined in slurm/spank.h. The SPANK
       interface provides a simple function
         const char * spank_strerror(spank_err_t err);
       which may be used to translate a spank_err_t value into its string
       representation.

       The slurm_spank_log function can be used to print messages back to the
       user at an error level. This is to keep users from having to rely on
       the slurm_error function, which can be confusing because it prepends
       "error:" to every message.

SPANK OPTIONS

       SPANK plugins also have an interface through which they may define and
       implement extra job options. These options are made available to the
       user through Slurm commands such as srun(1), salloc(1), and sbatch(1).
       If the option is specified by the user, its value is forwarded and
       registered with the plugin in slurmd when the job is run. In this way,
       SPANK plugins may dynamically provide new options and functionality to
       Slurm.

       Each option registered by a plugin to Slurm takes the form of a struct
       spank_option which is declared in <slurm/spank.h> as
          struct spank_option {
             char *         name;
             char *         arginfo;
             char *         usage;
             int            has_arg;
             int            val;
             spank_opt_cb_f cb;
          };

       Where

       name   is the name of the option. Its length is limited to
              SPANK_OPTION_MAXLEN defined in <slurm/spank.h>.

       arginfo
              is a description of the argument to the option, if the option
              does take an argument.

       usage  is a short description of the option suitable for --help output.

       has_arg
              0 if option takes no argument, 1 if option takes an argument,
              and 2 if the option takes an optional argument. (See
              getopt_long(3)).

       val    A plugin-local value to return to the option callback function.

       cb     A callback function that is invoked when the plugin option is
              registered with Slurm. spank_opt_cb_f is typedef'd in
              <slurm/spank.h> as

                typedef int (*spank_opt_cb_f) (int val, const char *optarg,
                                         int remote);
              Where val is the value of the val field in the spank_option
              struct, optarg is the supplied argument if applicable, and
              remote is 0 if the function is being called from the "local"
              host (e.g. the host where srun or sbatch/salloc are invoked) or
              1 if called from the "remote" host (the host where
              slurmd/slurmstepd run). On the remote host, the callback is
              only executed by slurmstepd (remote context), and only if the
              option was registered for that context.
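
       The three has_arg values follow the getopt_long(3) convention. SPANK
       performs its own option processing, so the sketch below does not show
       what Slurm does internally; it only illustrates the getopt_long(3)
       behavior for an option declared with optional_argument (numeric value
       2). The function parse_renice and the 'r' return code are
       hypothetical names chosen for this sketch.

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Parse a command line that may contain a hypothetical --renice[=prio]
 * option declared with optional_argument. Returns 1 if the option was
 * given, 0 if it was not, and -1 on a parse error. *arg is set to the
 * attached argument, or to NULL if the option had none.
 */
static int parse_renice(int argc, char **argv, const char **arg)
{
    static const struct option longopts[] = {
        { "renice", optional_argument, NULL, 'r' },
        { NULL, 0, NULL, 0 }
    };
    int c, seen = 0;

    *arg = NULL;
    optind = 1;   /* restart scanning so the helper can be called again */
    opterr = 0;   /* report errors through the return value, not stderr */

    while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
        if (c != 'r')
            return -1;
        seen = 1;
        *arg = optarg;   /* NULL when "--renice" has no "=prio" attached */
    }
    return seen;
}
```

       An optional argument must be attached with "=", so "--renice=5"
       yields the argument "5" while a bare "--renice" yields no argument.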

       Plugin options may be registered with Slurm using the
       spank_option_register function. This function is only valid when
       called from the plugin's slurm_spank_init handler, and registers one
       option at a time. The prototype is
          spank_err_t spank_option_register (spank_t sp,
                    struct spank_option *opt);
       This function will return ESPANK_SUCCESS on successful registration of
       an option, or ESPANK_BAD_ARG for errors including an invalid spank_t
       handle, or when the function is not called from the slurm_spank_init
       function. All options need to be registered from all contexts in which
       they will be used. For instance, if an option is only used in local
       (srun) and remote (slurmd) contexts, then spank_option_register should
       only be called from within those contexts. For example:
          if (spank_context() != S_CTX_ALLOCATOR)
             spank_option_register (sp, opt);
       If, however, the option is used in all contexts, then
       spank_option_register needs to be called in all of them.

       In addition to spank_option_register, plugins may also export options
       to Slurm by defining a table of struct spank_option with the symbol
       name spank_options. This method, however, is not supported for use with
       sbatch and salloc (allocator context), thus the use of
       spank_option_register is preferred. When using the spank_options table,
       the final element in the array must be filled with zeros. A
       SPANK_OPTIONS_TABLE_END macro is provided in <slurm/spank.h> for this
       purpose.

       When an option is provided by the user on the local side, either by
       command line options or by environment variables, Slurm will
       immediately invoke the option's callback with remote=0. This is meant
       for the plugin to do local sanity checking of the option before the
       value is sent to the remote side during job launch. If the argument
       the user specified is invalid, the plugin should report an error and
       return a non-zero value from the callback. The plugin should be able
       to handle cases where the SPANK option is set multiple times through
       environment variables and command line options. Environment variables
       are processed before command line options.

       On the remote side, options and their arguments are registered just
       after SPANK plugins are loaded and before the slurm_spank_init handler
       is called. This allows plugins to modify behavior of all plugin
       functionality based on the value of user-provided options. (See
       EXAMPLES below for a plugin that registers an option with Slurm).

       As an alternative to use of an option callback and global variable,
       plugins can use the spank_option_getopt function to check for supplied
       options after option processing. This function has the prototype:
          spank_err_t spank_option_getopt(spank_t sp,
              struct spank_option *opt, char **optargp);
       This function returns ESPANK_SUCCESS if the option defined in the
       struct spank_option opt has been used by the user. If optargp is
       non-NULL then it is set to any option argument passed (if the option
       takes an argument). The use of this method is required to process
       options in job_script context (slurm_spank_job_prolog and
       slurm_spank_job_epilog). This function is valid in the following
       callbacks: slurm_spank_job_prolog, slurm_spank_local_user_init,
       slurm_spank_user_init, slurm_spank_task_init_privileged,
       slurm_spank_task_init, slurm_spank_task_exit, and
       slurm_spank_job_epilog.

CONFIGURATION

       The default SPANK plug-in stack configuration file is plugstack.conf in
       the same directory as slurm.conf(5), though this may be changed via the
       Slurm config parameter PlugStackConfig. Normally the plugstack.conf
       file should be identical on all nodes of the cluster. The config file
       lists SPANK plugins, one per line, along with whether the plugin is
       required or optional, and any global arguments that are to be passed to
       the plugin for runtime configuration. Comments are preceded with '#'
       and extend to the end of the line. If the configuration file is missing
       or empty, it will simply be ignored.

       The format of each non-comment line in the configuration file is:
         required/optional   plugin   arguments
       For example:
         optional /usr/lib/slurm/test.so
       This tells slurmd to load the plugin test.so, passing no arguments. If
       a SPANK plugin is required, then failure of any of the plugin's
       functions will cause slurmd to terminate the job, while optional
       plugins only cause a warning.

       If a fully-qualified path is not specified for a plugin, then the
       currently configured PluginDir in slurm.conf(5) is searched.

       SPANK plugins are stackable, meaning that more than one plugin may be
       placed into the config file. The plugins will simply be called in
       order, one after the other, and appropriate action taken on failure
       given the state of the plugin's optional flag.
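
       As a sketch of a stacked configuration, the fragment below lists two
       plugins that would be called in the order shown. The plugin file
       names and the level=2 argument are placeholders invented for this
       illustration; the second entry omits the directory and so would be
       looked up in PluginDir.

```
#
# Hypothetical plugstack.conf; plugin names and arguments are placeholders.
#
# required/optional   plugin                     arguments
#
required              /usr/lib/slurm/first.so    level=2
optional              second.so
```

       Here a failure in first.so would terminate the job, while a failure
       in second.so would only produce a warning.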

       Additional config files or directories of config files may be included
       in plugstack.conf with the include keyword. The include keyword must
       appear on its own line, and takes a glob as its parameter, so multiple
       files may be included from one include line. For example, the following
       syntax will load all config files in the /etc/slurm/plugstack.conf.d
       directory, in local collation order:
         include /etc/slurm/plugstack.conf.d/*
       which might be considered a more flexible method for building up a
       SPANK plugin stack.

       The SPANK config file is re-read on each job launch, so editing the
       config file will not affect running jobs. However, care should be taken
       so that a partially edited config file is not read by a launching job.

ERRORS

       When a SPANK plugin function returns a non-zero value, the following
       behavior results:

       ┌───────┬────────────────────────────────┬─────────┬────────┬───────────┬─────────┐
       │Command│Function                        │Context  │Exitcode│Drains Node│Fails job│
       ├───────┼────────────────────────────────┼─────────┼────────┼───────────┼─────────┤
       │srun   │slurm_spank_init                │local    │1       │no         │yes      │
       │srun   │slurm_spank_init_post_opt       │local    │1       │no         │yes      │
       │srun   │slurm_spank_local_user_init     │local    │1       │no         │no       │
       │srun   │slurm_spank_user_init           │remote   │0       │no         │no       │
       │srun   │slurm_spank_task_init_privileged│remote   │1       │no         │yes      │
       │srun   │slurm_spank_task_post_fork      │remote   │0       │no         │no       │
       │srun   │slurm_spank_task_init           │remote   │1       │no         │yes      │
       │srun   │slurm_spank_task_exit           │remote   │0       │no         │no       │
       │srun   │slurm_spank_exit                │local    │0       │no         │yes      │
       ├───────┼────────────────────────────────┼─────────┼────────┼───────────┼─────────┤
       │salloc │slurm_spank_init                │allocator│1       │no         │yes      │
       │salloc │slurm_spank_init_post_opt       │allocator│1       │no         │yes      │
       │salloc │slurm_spank_init                │local    │1       │no         │yes      │
       │salloc │slurm_spank_init_post_opt       │local    │1       │no         │yes      │
       │salloc │slurm_spank_local_user_init     │local    │1       │no         │yes      │
       │salloc │slurm_spank_user_init           │remote   │0       │no         │no       │
       │salloc │slurm_spank_task_init_privileged│remote   │1       │no         │yes      │
       │salloc │slurm_spank_task_post_fork      │remote   │0       │no         │no       │
       │salloc │slurm_spank_task_init           │remote   │1       │no         │yes      │
       │salloc │slurm_spank_task_exit           │remote   │0       │no         │no       │
       │salloc │slurm_spank_exit                │local    │0       │no         │yes      │
       │salloc │slurm_spank_exit                │allocator│0       │no         │yes      │
       ├───────┼────────────────────────────────┼─────────┼────────┼───────────┼─────────┤
       │sbatch │slurm_spank_init                │allocator│1       │no         │yes      │
       │sbatch │slurm_spank_init_post_opt       │allocator│1       │no         │yes      │
       │sbatch │slurm_spank_init                │local    │1       │no         │yes      │
       │sbatch │slurm_spank_init_post_opt       │local    │1       │no         │yes      │
       │sbatch │slurm_spank_local_user_init     │local    │1       │no         │yes      │
       │sbatch │slurm_spank_user_init           │remote   │0       │yes        │no       │
       │sbatch │slurm_spank_task_init_privileged│remote   │1       │no         │yes      │
       │sbatch │slurm_spank_task_post_fork      │remote   │0       │yes        │no       │
       │sbatch │slurm_spank_task_init           │remote   │1       │no         │yes      │
       │sbatch │slurm_spank_task_exit           │remote   │0       │no         │no       │
       │sbatch │slurm_spank_exit                │local    │0       │no         │no       │
       │sbatch │slurm_spank_exit                │allocator│0       │no         │no       │
       └───────┴────────────────────────────────┴─────────┴────────┴───────────┴─────────┘
       NOTE: The behavior for ProctrackType=proctrack/pgid may result in
       timeouts for slurm_spank_task_post_fork with remote context on failure.

EXAMPLE: renice.so

       /etc/slurm/plugstack.conf:
              This example plugstack.conf file shows a configuration that
              activates the renice.so SPANK plugin.

              #
              # SPANK config file
              #
              # required?       plugin                     parameters
              #
              optional          /usr/lib/SPANK_renice.so   min_prio=-10

       /usr/local/src/renice.c:
              A sample SPANK plugin to modify the nice value of job tasks.
              This plugin adds a --renice=[prio] option to srun which users
              can use to set the priority of all remote tasks. Priority may
              also be specified via a SLURM_RENICE environment variable. A
              minimum priority may be established via a "min_prio" parameter
              in plugstack.conf.

              #include <sys/types.h>
              #include <stdio.h>
              #include <stdlib.h>
              #include <unistd.h>
              #include <string.h>
              #include <sys/resource.h>

              #include <slurm/spank.h>

              /*
               * All spank plugins must define this macro for the
               * Slurm plugin loader.
               */
              SPANK_PLUGIN(renice, 1);

              #define PRIO_ENV_VAR "SLURM_RENICE"
              #define PRIO_NOT_SET -1

              /*
               * Minimum allowable value for priority. May be
               * set globally via plugin option min_prio=<prio>
               */
              static int min_prio = -20;

              static int prio = PRIO_NOT_SET;

              static int _renice_opt_process(int val, const char *optarg, int remote);
              static int _str2prio(const char *str, int *p2int);

              /*
               *  Provide a --renice=[prio] option to srun:
               */
              struct spank_option spank_options[] =
              {
                  {
                      "renice",
                      "[prio]",
                      "Re-nice job tasks to priority [prio].",
                      2,
                      0,
                      _renice_opt_process
                  },
                  SPANK_OPTIONS_TABLE_END
              };

              /*
               *  Called from both srun and slurmd.
               */
              int slurm_spank_init(spank_t sp, int ac, char **av)
              {
                  int i;

                  /* Don't do anything in sbatch/salloc */
                  if (spank_context () == S_CTX_ALLOCATOR)
                      return ESPANK_SUCCESS;

                  for (i = 0; i < ac; i++) {
                      if (!strncmp("min_prio=", av[i], 9)) {
                          const char *optarg = av[i] + 9;

                          if (_str2prio(optarg, &min_prio))
                              slurm_error ("Ignoring invalid min_prio value: %s", av[i]);
                      } else {
                          slurm_error ("renice: Invalid option: %s", av[i]);
                      }
                  }

                  if (!spank_remote(sp))
                      slurm_verbose("renice: min_prio = %d", min_prio);

                  return ESPANK_SUCCESS;
              }

              int slurm_spank_task_post_fork(spank_t sp, int ac, char **av)
              {
                  int rc;
                  pid_t pid;
                  int taskid;

                  if (prio == PRIO_NOT_SET) {
                      /* See if SLURM_RENICE env var is set by user */
                      char val[1024];

                      rc = spank_getenv(sp, PRIO_ENV_VAR, val, sizeof(val));

                      if (rc)
                          return rc;

                      rc = _str2prio(val, &prio);

                      if (rc) {
                          /* Report the rejected value read from the environment */
                          slurm_error("Bad value for %s: %s", PRIO_ENV_VAR, val);
                          return rc;
                      }

                      if (prio < min_prio) {
                          slurm_error("%s=%d not allowed, using min=%d",
                                      PRIO_ENV_VAR, prio, min_prio);
                      }
                  }

                  if (prio < min_prio)
                      prio = min_prio;

                  spank_get_item(sp, S_TASK_GLOBAL_ID, &taskid);
                  spank_get_item(sp, S_TASK_PID, &pid);

                  slurm_info("re-nicing task%d pid %d to %d", taskid, (int) pid, prio);

                  if (setpriority(PRIO_PROCESS, (int) pid, (int) prio)) {
                      slurm_error("setpriority: %m");
                      return ESPANK_ERROR;
                  }

                  return ESPANK_SUCCESS;
              }

              static int _str2prio(const char *str, int *p2int)
              {
                  long l;
                  char *p = NULL;

                  if (!str || str[0] == '\0')
                      return ESPANK_BAD_ARG;

                  l = strtol(str, &p, 10);

                  if (!p || (*p != '\0'))
                      return ESPANK_BAD_ARG;

                  if ((l < -20) || (l > 20)) {
                      slurm_error("Specify value between -20 and 20");
626                      return ESPANK_BAD_ARG;
627                  }
628
629                  *p2int = (int) l;
630
631                  return ESPANK_SUCCESS;
632              }
633
634              static int _renice_opt_process(int val, const char *optarg, int remote)
635              {
636                  int rc;
637
638                  if (optarg == NULL) {
639                      slurm_error("renice: invalid NULL argument!");
640                      return ESPANK_BAD_ARG;
641                  }
642
643                  if ((rc = _str2prio(optarg, &prio))) {
644                      slurm_error("Bad value for --renice: %s", optarg);
645                      return rc;
646                  }
647
648                  if (prio < min_prio) {
649                      slurm_error("--renice=%d not allowed, will use min=%d",
650                                  prio, min_prio);
651                  }
652
653                  return ESPANK_SUCCESS;
654              }
655
656
657       Compile command:
658
659              # gcc -ggdb3 -I${SLURM_PATH}/include/ -fPIC -shared -o /usr/lib/SPANK_renice.so /usr/local/src/renice.c
660
661
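
       The parsing and clamping logic of the renice example can be tried
       without a Slurm installation. The following is only a minimal sketch:
       parse_prio(), clamp_prio(), PARSE_OK and PARSE_BAD_ARG are hypothetical
       stand-ins (not part of the SPANK API) for _str2prio(), the min_prio
       check, ESPANK_SUCCESS and ESPANK_BAD_ARG, so that the code builds
       without spank.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for ESPANK_SUCCESS / ESPANK_BAD_ARG, so the
 * sketch compiles without spank.h. */
enum { PARSE_OK = 0, PARSE_BAD_ARG = 1 };

/*
 * Parse a decimal priority in the range [-20, 20], as _str2prio() does.
 * Empty strings, non-numeric input, trailing characters and
 * out-of-range values are rejected; *out is only written on success.
 */
static int parse_prio(const char *str, int *out)
{
    char *end = NULL;
    long l;

    assert(out != NULL);

    if (!str || str[0] == '\0')
        return PARSE_BAD_ARG;

    l = strtol(str, &end, 10);

    /* No digits consumed, or characters left over after the number */
    if (end == str || *end != '\0')
        return PARSE_BAD_ARG;

    if (l < -20 || l > 20)
        return PARSE_BAD_ARG;

    *out = (int) l;
    return PARSE_OK;
}

/*
 * Raise a requested priority to the configured floor, mirroring the
 * "if (prio < min_prio) prio = min_prio" step in the example plugin.
 */
static int clamp_prio(int prio, int min_prio)
{
    return (prio < min_prio) ? min_prio : prio;
}
```

       Under these assumptions, parse_prio("5", &v) succeeds and stores 5,
       parse_prio("5x", &v) is rejected, and clamp_prio(-15, -10) yields -10,
       which corresponds to a plugin loaded with min_prio=-10 refusing to go
       below that value.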

COPYING

       Portions copyright (C) 2010-2022 SchedMD LLC. Copyright (C) 2006 The
       Regents of the University of California. Produced at Lawrence Livermore
       National Laboratory (cf, DISCLAIMER). CODE-OCEC-09-009. All rights
       reserved.

       This file is part of Slurm, a resource management program. For details,
       see <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

FILES

       /etc/slurm/slurm.conf - Slurm configuration file.
       /etc/slurm/plugstack.conf - SPANK configuration file.
       /usr/include/slurm/spank.h - SPANK header file.

SEE ALSO

       srun(1), slurm.conf(5)



October 2021                    Slurm Component                       SPANK(8)