drmaa_wait(3) Grid Engine DRMAA drmaa_wait(3)

drmaa_synchronize, drmaa_wait, drmaa_wifexited, drmaa_wexitstatus,
drmaa_wifsignaled, drmaa_wtermsig, drmaa_wcoredump, drmaa_wifaborted -
Waiting for jobs to finish

#include "drmaa.h"

int drmaa_synchronize(
    const char *job_ids[],
    signed long timeout,
    int dispose,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wait(
    const char *job_id,
    char *job_id_out,
    size_t job_id_out_len,
    int *stat,
    signed long timeout,
    drmaa_attr_values_t **rusage,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wifaborted(
    int *aborted,
    int stat,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wifexited(
    int *exited,
    int stat,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wifsignaled(
    int *signaled,
    int stat,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wcoredump(
    int *core_dumped,
    int stat,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wexitstatus(
    int *exit_status,
    int stat,
    char *error_diagnosis,
    size_t error_diag_len
);

int drmaa_wtermsig(
    char *signal,
    size_t signal_len,
    int stat,
    char *error_diagnosis,
    size_t error_diag_len
);

The drmaa_synchronize() function blocks the calling thread until all
jobs specified in job_ids have failed or finished execution. If job_ids
contains 'DRMAA_JOB_IDS_SESSION_ALL', then this function waits for all
jobs submitted during this DRMAA session. The job_ids pointer array
must be NULL terminated.

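The NULL-terminated layout of job_ids can be illustrated without calling
into DRMAA at all. The two helpers below are a minimal sketch;
count_job_ids and wants_all_session_jobs are hypothetical names that are
not part of the DRMAA library, and the string literal simply mirrors the
quoted value in the paragraph above.

```c
#include <stddef.h>
#include <string.h>

/* Count the entries of a NULL-terminated job id array.
 * The terminating NULL itself is not counted. */
size_t count_job_ids(const char *job_ids[])
{
    size_t n = 0;
    while (job_ids[n] != NULL)
        n++;
    return n;
}

/* Return 1 if the array asks for every job of the session, else 0.
 * Assumption: the special value is the literal string quoted above. */
int wants_all_session_jobs(const char *job_ids[])
{
    size_t i;
    for (i = 0; job_ids[i] != NULL; i++) {
        if (strcmp(job_ids[i], "DRMAA_JOB_IDS_SESSION_ALL") == 0)
            return 1;
    }
    return 0;
}
```

An array such as { "4711", "4712", NULL } (made-up job ids) would be
passed to drmaa_synchronize() as job_ids. Forgetting the final NULL lets
any loop of this kind run past the end of the array.
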
To prevent blocking indefinitely in this call, the caller may use the
timeout parameter to specify how many seconds to wait for this call to
complete before timing out. The special value DRMAA_TIMEOUT_WAIT_FOREVER
can be used to wait indefinitely for a result. The special value
DRMAA_TIMEOUT_NO_WAIT can be used to return immediately. If the call
returns before timeout seconds have passed, either all the specified
jobs have completed or the calling thread received an interrupt. If the
call returns because the timeout elapsed, the return code is
DRMAA_ERRNO_EXIT_TIMEOUT.

The dispose parameter specifies how to treat reaping information. If
'0' is passed to this parameter, job finish information will still be
available when drmaa_wait(3) is used. If '1' is passed, drmaa_wait(3)
will be unable to access this job's finish information.

drmaa_wait()
The drmaa_wait() function blocks the calling thread until a job fails
or finishes execution. This routine is modeled on the wait4(3) routine.
If the special string 'DRMAA_JOB_IDS_SESSION_ANY' is passed as job_id,
this routine will wait for any job from the session. Otherwise the
job_id must be the job identifier of a job or array job task that was
submitted during the session.

To prevent blocking indefinitely in this call, the caller may use the
timeout parameter to specify how many seconds to wait for this call to
complete before timing out. The special value DRMAA_TIMEOUT_WAIT_FOREVER
can be used to wait indefinitely for a result. The special value
DRMAA_TIMEOUT_NO_WAIT can be used to return immediately. If the call
returns before timeout seconds have passed, either the job has completed
or the calling thread received an interrupt. If the call returns because
the timeout elapsed, the return code is DRMAA_ERRNO_EXIT_TIMEOUT.

The routine reaps jobs on a successful call, so any subsequent call to
drmaa_wait(3) for the same job will fail with a DRMAA_ERRNO_INVALID_JOB
error, meaning that the job has already been reaped. This error is the
same as if the job were unknown. Returning due to an elapsed timeout or
an interrupt does not cause the job information to be reaped. This means
that, in this case, it is possible to issue drmaa_wait(3) multiple
times for the same job_id.

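Because a timed-out or interrupted drmaa_wait() leaves the job unreaped,
a caller that wants one overall deadline can re-issue the call with
whatever time is left. The helper below sketches only that bookkeeping.
remaining_timeout is a hypothetical name, not part of DRMAA, and the
sketch assumes that a timeout of 0 seconds is acceptable to the
implementation.

```c
#include <time.h>

/* Seconds left until `deadline`, clamped to 0 so that the result is
 * never negative and can be passed as the timeout of a repeated wait. */
signed long remaining_timeout(time_t deadline, time_t now)
{
    double left = difftime(deadline, now);
    return left > 0.0 ? (signed long)left : 0L;
}
```

A caller would compute the deadline once, for example as time(NULL) plus
the overall limit, and pass remaining_timeout(deadline, time(NULL)) as
the timeout argument each time drmaa_wait() is retried.
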
If job_id_out is not a null pointer, then on return from a successful
drmaa_wait(3) call, up to job_id_out_len characters from the job id of
the failed or finished job are returned.

If stat is not a null pointer, then on return from a successful
drmaa_wait(3) call, the status of the job is stored in the integer
pointed to by stat. stat indicates whether the job failed or finished,
along with other information. The information encoded in the integer
value can be accessed via drmaa_wifaborted(3), drmaa_wifexited(3),
drmaa_wifsignaled(3), drmaa_wcoredump(3), drmaa_wexitstatus(3), and
drmaa_wtermsig(3).

If rusage is not a null pointer, then on return from a successful
drmaa_wait(3) call, a summary of the resources used by the terminated
job is returned in the form of a DRMAA values string vector. The entries
in the DRMAA values string vector can be extracted using
drmaa_get_next_attr_value(3). Each string returned by
drmaa_get_next_attr_value(3) has the format <name>=<value>, where <name>
and <value> specify the name and amount of a resource consumed by the
job, respectively. See accounting(5) for an explanation of the resource
information.

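Splitting one <name>=<value> entry needs only standard C. The function
below is a minimal sketch; split_rusage_entry is a hypothetical helper,
not part of DRMAA, and the sketch assumes the entry has already been
copied out of the vector with drmaa_get_next_attr_value(3).

```c
#include <stddef.h>
#include <string.h>

/* Split "<name>=<value>" at the first '='.
 * Returns 0 on success, or -1 if there is no '=' or if either output
 * buffer is too small to hold its part plus the terminating NUL. */
int split_rusage_entry(const char *entry,
                       char *name, size_t name_len,
                       char *value, size_t value_len)
{
    const char *eq = strchr(entry, '=');
    size_t n, v;

    if (eq == NULL)
        return -1;

    n = (size_t)(eq - entry);   /* length of the name part  */
    v = strlen(eq + 1);         /* length of the value part */
    if (n >= name_len || v >= value_len)
        return -1;

    memcpy(name, entry, n);
    name[n] = '\0';
    memcpy(value, eq + 1, v + 1);   /* also copies the terminating NUL */
    return 0;
}
```

The value is kept as a string here because the text above does not fix
a numeric type for it; a caller can convert it with strtod(3) where a
number is expected.
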
drmaa_wifaborted()
The drmaa_wifaborted() function stores a non-zero value in the integer
pointed to by aborted if stat was returned for a job that ended before
entering the running state.

drmaa_wifexited()
The drmaa_wifexited() function stores a non-zero value in the integer
pointed to by exited if stat was returned for a job that terminated
normally. A zero value can also indicate that, although the job has
terminated normally, an exit status is not available, or that it is not
known whether the job terminated normally. In both cases
drmaa_wexitstatus(3) will not provide exit status information. A
non-zero value returned in exited indicates that more detailed diagnosis
can be provided by means of drmaa_wifsignaled(3), drmaa_wtermsig(3) and
drmaa_wcoredump(3).

drmaa_wifsignaled()
The drmaa_wifsignaled() function stores a non-zero value in the integer
pointed to by signaled if stat was returned for a job that terminated
due to the receipt of a signal. A zero value can also indicate that,
although the job has terminated due to the receipt of a signal, the
signal is not available, or that it is not known whether the job
terminated due to the receipt of a signal. In both cases
drmaa_wtermsig(3) will not provide signal information. A non-zero value
returned in signaled indicates that signal information can be retrieved
by means of drmaa_wtermsig(3).

drmaa_wcoredump()
If drmaa_wifsignaled(3) returned a non-zero value in the signaled
parameter, the drmaa_wcoredump() function stores a non-zero value in the
integer pointed to by core_dumped if a core image of the terminated job
was created.

drmaa_wexitstatus()
If drmaa_wifexited(3) returned a non-zero value in the exited parameter,
the drmaa_wexitstatus() function stores in the integer pointed to by
exit_status the exit code that the job passed to exit(2) or the value
that the child process returned from main.

drmaa_wtermsig()
If drmaa_wifsignaled(3) returned a non-zero value in the signaled
parameter, the drmaa_wtermsig() function copies into signal up to
signal_len characters of a string representation of the signal that
caused the termination of the job. For signals declared by POSIX.1, the
symbolic names are returned (e.g., SIGABRT, SIGALRM). For signals not
declared by POSIX.1, any other string may be returned.

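The order in which the decoded values are usually consulted can be
sketched as a plain C function. describe_job_end is a hypothetical
helper, not part of DRMAA. It assumes its arguments were already filled
in by the drmaa_wif*() and drmaa_w*() calls described above, and the
message texts are illustrative only.

```c
#include <stdio.h>

/* Build a one-line summary from values already decoded from stat.
 * Returns what snprintf returns: the length of the full message. */
int describe_job_end(int aborted, int exited, int exit_status,
                     int signaled, const char *signal, int core_dumped,
                     char *out, size_t out_len)
{
    if (aborted)        /* job never reached the running state */
        return snprintf(out, out_len, "aborted before running");
    if (exited)         /* exit_status is only meaningful here */
        return snprintf(out, out_len, "exited with status %d", exit_status);
    if (signaled)       /* signal and core_dumped are only meaningful here */
        return snprintf(out, out_len, "killed by %s%s", signal,
                        core_dumped ? " (core dumped)" : "");
    return snprintf(out, out_len, "finished, no further status available");
}
```

Checking aborted first and falling through to a generic message reflects
the text above: a zero in exited or signaled may simply mean that the
information is not available.
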
GE_ROOT Specifies the location of the Grid Engine standard
configuration files.

GE_CELL If set, specifies the default Grid Engine cell to be
used. To address a Grid Engine cell, Grid Engine uses (in
the order of precedence):

The name of the cell specified in the environment
variable GE_CELL, if it is set.

The name of the default cell, i.e. default.

GE_DEBUG_LEVEL If set, specifies that debug information should be
written to stderr. In addition the level of detail in which
debug information is generated is defined.

GE_QMASTER_PORT
If set, specifies the tcp port on which ge_qmaster(8) is
expected to listen for communication requests. Most
installations will use a services map entry instead to
define that port.

Upon successful completion, drmaa_synchronize(), drmaa_wait(),
drmaa_wifexited(), drmaa_wexitstatus(), drmaa_wifsignaled(),
drmaa_wtermsig(), drmaa_wcoredump(), and drmaa_wifaborted() return
DRMAA_ERRNO_SUCCESS. Other values indicate an error. Up to
error_diag_len characters of error-related diagnosis information are
then provided in the buffer error_diagnosis.

The drmaa_synchronize(), drmaa_wait(), drmaa_wifexited(),
drmaa_wexitstatus(), drmaa_wifsignaled(), drmaa_wtermsig(),
drmaa_wcoredump(), and drmaa_wifaborted() functions will fail if:

DRMAA_ERRNO_INTERNAL_ERROR
Unexpected or internal DRMAA error, like system call failure, etc.

DRMAA_ERRNO_DRM_COMMUNICATION_FAILURE
Could not contact DRM system for this request.

DRMAA_ERRNO_AUTH_FAILURE
The specified request was not processed successfully due to an
authorization failure.

DRMAA_ERRNO_INVALID_ARGUMENT
The input value for an argument is invalid.

DRMAA_ERRNO_NO_ACTIVE_SESSION
Failed because there is no active session.

DRMAA_ERRNO_NO_MEMORY
Failed allocating memory.

The drmaa_synchronize() and drmaa_wait() functions will fail if:

DRMAA_ERRNO_EXIT_TIMEOUT
Time-out condition.

DRMAA_ERRNO_INVALID_JOB
The specified job does not exist.

The drmaa_wait() function will fail if:

DRMAA_ERRNO_NO_RUSAGE
This error code is returned by drmaa_wait() when a job has finished but
no rusage and stat data could be provided.

drmaa_submit(3).

GE 6.2u5 $Date: 2004/11/12 15:40:05 $ drmaa_wait(3)