JSV(1)                     Grid Engine File Formats                     JSV(1)

NAME

       JSV - Grid Engine Job Submission Verifier

DESCRIPTION

       JSV is an abbreviation for Job Submission Verifier. A JSV is a script
       or binary that can be used to verify, modify or reject a job at the
       time of job submission.

       JSVs are triggered by submit clients such as qsub, qrsh, qsh and qmon
       on submit hosts (Client JSV), or they verify incoming jobs on the
       master host (Server JSV), or both.


CONFIGURATION

       JSVs can be configured in several locations. A jsv_url can be
       provided with the -jsv submit parameter during job submission, a
       corresponding switch can be added to one of the sge_request files, or
       a jsv_url can be configured in the global cluster configuration of the
       Grid Engine installation.

       All defined JSV instances are executed in the following order:

          1) qsub -jsv ...
          2) $cwd/.sge_request
          3) $HOME/.sge_request
          4) $SGE_ROOT/$SGE_CELL/common/sge_request
          5) Global configuration

       The Client JSVs (1-3) can be defined by Grid Engine end users, whereas
       the client JSV defined in the global sge_request file (4) and the
       server JSV (5) can only be defined by Grid Engine administrators.

       Because (4) and (5) are defined and configured by Grid Engine
       administrators and are executed last in the sequence of JSV
       instances, they give an administrator an additional way to enforce
       certain policies for a cluster.

       As soon as one JSV instance rejects a job, the whole verification
       process is stopped and the end user gets a corresponding error
       message stating that the submission of the job has failed.

       If a JSV accepts a job, with or without modifications, then the
       following JSV instance gets the job parameters, including all
       modifications, as input for its verification. This continues until
       the last JSV instance has accepted the job or one instance has
       rejected it.

       More information on how to use Client JSVs can be found in qsub(1);
       Server JSVs are described in sge_conf(5).

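       The chaining rule described in this section can be sketched in a few
       lines of Python. This is only an illustration, not Grid Engine code:
       the names run_chain, add_project and require_project are
       hypothetical, and each verifier is assumed to return a tuple of
       (accepted, job parameters, message).

```python
from typing import Callable, Dict, List, Tuple

Job = Dict[str, str]
# Hypothetical verifier signature: (accepted, possibly modified job, message).
Verifier = Callable[[Job], Tuple[bool, Job, str]]


def run_chain(job: Job, verifiers: List[Verifier]) -> Tuple[bool, Job, str]:
    """Run verifiers in order; stop at the first rejection."""
    current = dict(job)
    for verify in verifiers:
        accepted, current, message = verify(current)
        if not accepted:
            # One rejection ends the whole verification process.
            return False, current, message
    return True, current, ""


def add_project(job: Job) -> Tuple[bool, Job, str]:
    """Illustrative user-level JSV: supply a default project."""
    changed = dict(job)
    changed.setdefault("P", "default")
    return True, changed, ""


def require_project(job: Job) -> Tuple[bool, Job, str]:
    """Illustrative administrator JSV: reject jobs without a project."""
    if "P" not in job:
        return False, job, "job has no project"
    return True, job, ""
```

       Because later instances see the modifications of earlier ones, the
       order matters: run_chain({"N": "x"}, [add_project, require_project])
       accepts the job, while the reversed list rejects it.
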

LIFETIME

       A Client or Server JSV is started as its own UNIX process. This
       process communicates with either a Grid Engine client process or the
       master daemon by exchanging commands, job parameters and other data
       via its stdin/stdout channels.

       Client JSV instances are started by client applications before a job
       is sent to qmaster. Such an instance verifies the job to be
       submitted. After that verification the JSV process is stopped.

       Server JSV instances are started for each worker thread of the
       qmaster process (for version 6.2 of Grid Engine this means that two
       processes are started). Each of those processes has to verify job
       parameters for multiple jobs as long as the master is running, the
       underlying JSV configuration is not changed and no error occurs.


TIMEOUT

       The timeout is a configurable limit on the response time of a client
       or server JSV. If the response time of the JSV exceeds the specified
       timeout value, the JSV is restarted. The server JSV timeout value is
       specified through the qmaster parameter jsv_timeout. The client JSV
       timeout value is set through the environment variable
       SGE_JSV_TIMEOUT. The default value is 10 seconds, and the value must
       be greater than 0. After a timeout the JSV is restarted only once; if
       the timeout is reached again, an error occurs.


THRESHOLD

       The threshold value is defined by the qmaster parameter
       jsv_threshold. It applies to the time needed for a server job
       verification. If this time exceeds the defined threshold, additional
       logging appears in the master message file at the INFO level. The
       value is specified in milliseconds and defaults to 5000. A value of 0
       means that all jobs are logged in the message file.

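       As a small illustration, the logging decision described above could
       be written as follows. The function name should_log is hypothetical;
       only the default of 5000 milliseconds and the meaning of 0 are taken
       from the text.

```python
def should_log(elapsed_ms: int, threshold_ms: int = 5000) -> bool:
    """Return True if a server job verification should be logged."""
    if threshold_ms == 0:
        # A threshold of 0 means every job is logged.
        return True
    # Otherwise only verifications that exceed the threshold are logged.
    return elapsed_ms > threshold_ms
```
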

PROTOCOL

       After a JSV script or binary is started, it receives commands through
       its stdin stream and has to respond with certain commands on its
       stdout stream. Data which is sent via the stderr stream of a JSV
       instance is ignored. Each command which is sent to or by a JSV script
       has to be terminated by a new line character ('\n'); new line
       characters are not allowed within the command string itself.

       In general, commands which are exchanged between a JSV and
       client/qmaster have the following format. Commands and arguments are
       case sensitive. The EBNF command description is given below.

             command := command_name ' ' { argument ' ' } ;

       A command starts with a command_name followed by a space character and
       a space separated list of arguments.

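       A minimal Python sketch of this framing is shown below. The helper
       names parse_command and format_command are hypothetical. The parser
       splits on single spaces and therefore does not preserve values that
       contain spaces; it is meant only to illustrate the command format.

```python
from typing import List, Tuple


def parse_command(line: str) -> Tuple[str, List[str]]:
    """Split one protocol line into (command_name, arguments)."""
    body = line[:-1] if line.endswith("\n") else line
    if "\n" in body:
        raise ValueError("new line characters are not allowed in a command")
    # Names are case sensitive, so no case folding is done here.
    fields = [field for field in body.split(" ") if field]
    if not fields:
        raise ValueError("empty command")
    return fields[0], fields[1:]


def format_command(name: str, *args: str) -> str:
    """Join a command name and its arguments into one terminated line."""
    parts = [name, *args]
    if any("\n" in part for part in parts):
        raise ValueError("new line characters are not allowed in a command")
    return " ".join(parts) + "\n"
```

       For example, parse_command("PARAM VERSION 1.0\n") yields the name
       PARAM with the arguments VERSION and 1.0.
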

PROTOCOL (JSV side)

       The following commands have to be implemented by a JSV script so that
       it conforms to version 1.0 of the JSV protocol, which was first
       implemented in Grid Engine 6.2u2:

       begin_command := 'BEGIN' ;
              After a JSV instance has received all env_commands and
              param_commands of a job which should be verified, the
              client/qmaster triggers the verification process by sending
              one begin_command. After that it waits for param_commands
              and env_commands which are sent back from the JSV instance to
              modify the job specification. As part of the verification
              process a JSV script or binary has to use the result_command to
              indicate that the verification process is finished for a job.

       env_command := 'ENV' ' ' modifier ' ' name ' ' value ;

       modifier := 'ADD' | 'MOD' | 'DEL' ;
              The env_command is an optional command which only has to be
              implemented by a JSV instance if that JSV sends the
              send_data_command before it sends the started_command. Only in
              that case will the client or master use one or more
              env_commands to pass the environment variables (name and
              value) to the JSV instance which would be exported to the job
              environment when the job is started. Client and qmaster only
              send env_commands with the modifier 'ADD'.

              JSV instances modify the set of environment variables by sending
              back env_commands using the modifiers ADD, MOD and DEL.

       param_command := 'PARAM' ' ' param_parameter ' ' value ;

       param_parameter := submit_parameter | pseudo_parameter ;
              The param_command has two additional arguments which are
              separated by space characters. The first argument is either a
              submit_parameter as specified in qsub(1) or a pseudo_parameter
              as documented below. The second argument is the value of the
              corresponding param_parameter.

              Multiple param_commands are sent to a JSV instance after the
              JSV has sent a started_command. Together, the param_commands
              which are sent represent the job specification of the job
              which should be verified.

              submit_parameters are, for example, b (similar to the qsub -b
              switch) or masterq (similar to the qsub -masterq switch). A
              complete list of submit_parameters can be found in the qsub(1)
              man page. Please note that the param_parameter name and the
              corresponding value format are not in all cases equivalent to
              the qsub switch name and its argument format. For example, the
              qsub -pe parameters are available as a set of parameters with
              the names pe_name, pe_min and pe_max, and the switch
              combination -soft -l is passed to JSV scripts as the l_soft
              parameter. For details concerning these differences, consult
              the qsub(1) man page.

       start_command := 'START' ;
              The start_command has no additional arguments. This command
              indicates that a new job verification should be started. It is
              the first command which is sent to a JSV script after it has
              been started, and it initiates each new job verification. A
              JSV instance may discard cached values which are still stored
              from a previous job verification. The application which sends
              the start_command waits for a started_command before it
              continues.

       quit_command := 'QUIT' ;
              The quit_command has no additional arguments. If this command is
              sent to a JSV instance, it should terminate itself immediately.

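       The commands of this section can be tied together in a small command
       loop. The following Python sketch is one possible shape of such a
       loop, not a reference implementation: run_jsv and the verify callback
       are hypothetical names, the callback is assumed to return a
       (result_type, message) pair, and the reply format "RESULT STATE ..."
       is taken from the transcript in the EXAMPLE section of this page.

```python
from typing import Callable, Dict, TextIO, Tuple

# Hypothetical callback: (params, env) -> (result_type, message).
Verify = Callable[[Dict[str, str], Dict[str, str]], Tuple[str, str]]


def run_jsv(inp: TextIO, out: TextIO, verify: Verify) -> None:
    """Read protocol commands from inp and answer on out until QUIT or EOF."""
    params: Dict[str, str] = {}
    env: Dict[str, str] = {}

    def send(command: str) -> None:
        out.write(command + "\n")
        out.flush()  # the peer waits for each reply, so do not buffer

    for raw in inp:
        line = raw.rstrip("\n")
        if not line:
            continue
        name, _, rest = line.partition(" ")
        if name == "START":
            # Discard values cached from a previous job verification.
            params.clear()
            env.clear()
            send("STARTED")
        elif name == "PARAM":
            key, _, value = rest.partition(" ")
            params[key] = value
        elif name == "ENV":
            modifier, _, tail = rest.partition(" ")
            key, _, value = tail.partition(" ")
            if modifier == "ADD":  # client/qmaster only send ADD
                env[key] = value
        elif name == "BEGIN":
            result_type, message = verify(params, env)
            send(f"RESULT STATE {result_type} {message}".rstrip())
        elif name == "QUIT":
            break
        else:
            send(f"ERROR unknown command {name}")
```

       A script built on this sketch would call
       run_jsv(sys.stdin, sys.stdout, verify) after importing sys. To
       receive env_commands it would additionally have to send "SEND ENV"
       before "STARTED", which this sketch leaves out.
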

PROTOCOL (client/qmaster side)

       A JSV script or binary can send a set of commands to a client/qmaster
       process to indicate its state in the communication process, to change
       the job specification of a job which should be verified, and to
       report messages or errors. The commands listed below are understood
       by a client/qmaster which implements version 1.0 of the
       communication protocol, which was first implemented in Grid Engine
       6.2u2:

       error_command := 'ERROR' ' ' message ;
              Any time a JSV script encounters an error it may report it to
              the client/qmaster. If the error happens during a job
              verification, the job which is currently being verified is
              rejected. The JSV binary or script is also restarted before
              it gets a new verification task.

       log_command := 'LOG' ' ' log_level ' ' message ;

       log_level := 'INFO' | 'WARNING' | 'ERROR' ;
              log_commands can be used whenever the client or qmaster expects
              input from a JSV instance. This command can be used in client
              JSVs to send information to the user submitting the job. In
              client JSVs all messages, independent of the log_level, are
              printed to the stdout stream of the submit client that is
              used. If a server JSV sends a log_command, the received
              message is added to the message file, respecting the specified
              log_level. Please note that the message may contain spaces but
              no new line characters.

       param_command (find definition above)
              By sending param_commands a JSV script can change the job
              specification of the job which should be verified. If the JSV
              instance later sends a result_command which indicates that
              the job should be accepted with corrections, then the values
              provided with these param_commands are used to modify the job
              before it is accepted by the Grid Engine system.

       result_command := 'RESULT' ' ' 'STATE' ' ' result_type [ ' ' message ] ;

       result_type := 'ACCEPT' | 'CORRECT' | 'REJECT' | 'REJECT_WAIT' ;
              After the verification of a job is done, a JSV script or binary
              has to send a result_command which indicates what should happen
              with the job. If the result_type is ACCEPT, the job is
              accepted as it was initially submitted by the end user. All
              param_commands and env_commands which might have been sent
              before the result_command are ignored in this case. The
              result_type CORRECT indicates that the job should be accepted
              after all modifications sent via param_commands and env_commands
              are applied to the job. REJECT and REJECT_WAIT cause the client
              or qmaster instance to reject the job.

       send_data_command := 'SEND' ' ' data_name ;

       data_name := 'ENV' ;
              If a client/qmaster receives a send_data_command from a JSV
              instance before a started_command is sent, then it not only
              passes job parameters with param_commands but also sends
              env_commands, which tell the JSV which environment variables
              would be exported to the job environment if the job is
              accepted and started later on.

              The job environment is not passed to JSV instances by default
              because the job environment of the end user might contain data
              which could be misinterpreted in the JSV context and might
              therefore cause errors or security issues.

       started_command := 'STARTED' ;
              By sending the started_command a JSV instance indicates that it
              is ready to receive param_commands and env_commands for a new
              job verification. It only receives env_commands if it sends
              a send_data_command before the started_command.

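       How the result_type decides the fate of the modifications sent by a
       JSV can be modelled as follows. This Python sketch is illustrative
       only: apply_result is a hypothetical name, and treating ADD and MOD
       both as "set the variable" is an assumption, since this page does not
       spell out the difference between the two modifiers.

```python
from typing import Dict, List, Optional, Tuple

EnvChange = Tuple[str, str, str]  # (modifier, name, value)


def apply_result(job_params: Dict[str, str],
                 job_env: Dict[str, str],
                 param_changes: Dict[str, str],
                 env_changes: List[EnvChange],
                 result_type: str
                 ) -> Optional[Tuple[Dict[str, str], Dict[str, str]]]:
    """Return the (params, env) the job is accepted with, or None if rejected."""
    if result_type in ("REJECT", "REJECT_WAIT"):
        return None
    if result_type == "ACCEPT":
        # Modifications sent before the result are ignored.
        return dict(job_params), dict(job_env)
    if result_type != "CORRECT":
        raise ValueError(f"unknown result_type {result_type!r}")
    params = dict(job_params)
    params.update(param_changes)
    env = dict(job_env)
    for modifier, name, value in env_changes:
        if modifier in ("ADD", "MOD"):  # assumption: both set the variable
            env[name] = value
        elif modifier == "DEL":
            env.pop(name, None)
        else:
            raise ValueError(f"unknown modifier {modifier!r}")
    return params, env
```
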

PSEUDO PARAMETERS

       CLIENT The corresponding value for the CLIENT parameter is either
              'qmaster' or the name of a submit client like 'qsub', 'qsh',
              'qrsh', 'qlogin' and so on. This parameter value can't be
              changed by JSV instances. It is always sent as part of a
              job verification.

       CMDARGS
              Number of arguments which will be passed to the job script or
              command when the job execution is started. It is always
              sent as part of a job verification. If no arguments should be
              passed to the job script or command, it has the value 0.
              This parameter can be changed by JSV instances. If the value of
              CMDARGS is bigger than the number of available CMDARG<id>
              parameters, then the missing parameters are automatically
              passed as empty parameters to the job script.

       CMDNAME
              Either the path to the script or the command name in case of
              binary submission. It is always sent as part of a job
              verification.

       CONTEXT
              Either 'client' if the JSV which receives this param_command was
              started by a command line client like qsub, qsh, ... or 'master'
              if it was started by the sge_qmaster process. It is always
              sent as part of a job verification. Changing the value of this
              parameter is not possible within JSV instances.

       GROUP  Primary group of the user who tries to submit the job
              which should be verified. This parameter cannot be changed but
              is always sent as part of the verification process. The user
              name is passed as a parameter with the name USER.

       JOB_ID Not available in the client context (see CONTEXT). Otherwise it
              contains the job number of the job which will be submitted to
              Grid Engine when the verification process is successful. JOB_ID
              is an optional parameter which can't be changed by JSV
              instances.

       USER   Username of the user who tries to submit the job which should
              be verified. Cannot be changed but is always sent as part of the
              verification process. The group name is passed as a parameter
              with the name GROUP.

       VERSION
              VERSION is always sent as part of a job verification
              process, and it is always the first parameter which is sent.
              It contains a version number of the format <major>.<minor>.
              In version 6.2u2 and higher the value is '1.0'. The value
              of this parameter can't be changed.

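       The padding rule described for CMDARGS can be illustrated with a
       short Python sketch. The function name job_arguments is hypothetical,
       and the numbering of CMDARG<id> is assumed to start at 0, as the
       CMDARG0 parameter in the EXAMPLE section suggests.

```python
from typing import Dict, List


def job_arguments(params: Dict[str, str]) -> List[str]:
    """Build the argument list from CMDARGS and the CMDARG<id> parameters."""
    count = int(params.get("CMDARGS", "0"))
    # Missing CMDARG<id> parameters are passed as empty arguments.
    return [params.get(f"CMDARG{index}", "") for index in range(count)]
```

       With CMDARGS set to 3 and only CMDARG0 set to 12, the resulting list
       holds 12 followed by two empty arguments.
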

EXAMPLE

       Here is an example of the communication between a client and a JSV
       instance when the following job is submitted:

       > qsub -pe p 3 -hard -l a=1,b=5 -soft -l q=all.q $SGE_ROOT/examples/jobs/sleeper.sh

       Data in the first column is sent from the client/qmaster to the JSV
       instance. Data in the second column is sent from the JSV
       script to the client/qmaster. New line characters which terminate each
       line in the communication protocol are omitted.

          START
                                  SEND ENV
                                  STARTED
          PARAM VERSION 1.0
          PARAM CONTEXT client
          PARAM CLIENT qsub
          PARAM USER ernst
          PARAM GROUP staff
          PARAM CMDNAME /sge_root/examples/jobs/sleeper.sh
          PARAM CMDARGS 1
          PARAM CMDARG0 12
          PARAM l_hard a=1,b=5
          PARAM l_soft q=all.q
          PARAM M user@hostname
          PARAM N Sleeper
          PARAM o /dev/null
          PARAM pe_name pe1
          PARAM pe_min 3
          PARAM pe_max 3
          PARAM S /bin/sh
          BEGIN
                                  RESULT STATE ACCEPT

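       A JSV that receives the l_hard value shown in the transcript usually
       wants to look at individual resource requests. The following Python
       sketch shows one way to split such a value; parse_resource_list is a
       hypothetical helper, and it assumes a plain comma separated list of
       name=value pairs in which neither names nor values contain commas.

```python
from typing import Dict


def parse_resource_list(value: str) -> Dict[str, str]:
    """Split a value such as 'a=1,b=5' into {'a': '1', 'b': '5'}."""
    resources: Dict[str, str] = {}
    for item in value.split(","):
        if not item:
            continue
        # A request without '=' is kept with an empty value.
        name, _, amount = item.partition("=")
        resources[name] = amount
    return resources
```
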

SEE ALSO

       ge_intro(1), qalter(1), qlogin(1), qmake(1), qrsh(1), qsh(1), qsub(1),
       qtcsh(1).

COPYRIGHT

       See ge_intro(1) for a full statement of rights and permissions.

GE 6.2u5                 $Date: 2009/08/25 19:39:34 $                   JSV(1)