REPORTING(5)               Grid Engine File Formats               REPORTING(5)

NAME

       reporting - Grid Engine reporting file format

DESCRIPTION

       A Grid Engine system writes a reporting file to
       $SGE_ROOT/default/common/reporting.  The reporting file contains data
       that can be used for accounting, monitoring and analysis purposes.  It
       contains information about the cluster (hosts, queues, load values,
       consumables, etc.), about the jobs running in the cluster and about
       sharetree configuration and usage.  All information is time related;
       events are written to the reporting file at a configurable interval.
       This makes it possible to monitor the "real time" status of the
       cluster as well as to perform historical analysis.

FORMAT

       The reporting file is an ASCII file.  Each line contains one record,
       and the fields of a record are separated by a delimiter (:).  The
       reporting file contains records of different types.  Each record type
       has a specific record structure.

       The first two fields are common to all reporting records:

       time   Time (GMT Unix time stamp) when the record was created.

       record type
              Type of the reporting record.  The different types of records
              and their structure are described in the following text.
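
       As a minimal sketch of the layout described above, the following
       Python snippet splits one line at the delimiter and interprets the
       two common fields.  The function name split_record and the sample
       line are hypothetical, and the sketch assumes that no field value
       contains the delimiter itself.

```python
from datetime import datetime, timezone


def split_record(line):
    """Split one reporting line into (time, record_type, remaining_fields)."""
    fields = line.rstrip("\n").split(":")
    if len(fields) < 2:
        raise ValueError("a reporting record needs at least two fields")
    # The first field is a GMT Unix time stamp.
    created = datetime.fromtimestamp(int(fields[0]), tz=timezone.utc)
    return created, fields[1], fields[2:]


# Hypothetical sample line, made up for illustration only.
created, record_type, rest = split_record(
    "1184832000:queue:all.q:node01:1184832000:s\n"
)
```

       The remaining fields are returned unparsed because their meaning
       depends on the record type.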

   new_job
       The new_job record is written whenever a new job enters the system
       (usually by a submitting command).  It has the following fields:

       submission_time
              Time (GMT Unix time stamp) when the job was submitted.

       job_number
              The job number.

       task_number
              The array task id.  It has the value -1 for new_job records (as
              we don't have array tasks yet).

       pe_taskid
              The task id of parallel tasks.  It has the value "none" for
              new_job records.

       job_name
              The job name (from -N submission option).

       owner  The job owner.

       group  The Unix group of the job owner.

       project
              The project the job is running in.

       department
              The department the job owner is in.

       account
              The account string specified for the job (from -A submission
              option).

       priority
              The job priority (from -p submission option).
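
       The field list above can be turned into a small parser.  This is
       only a sketch: it assumes that the fields appear in the file in the
       order in which they are listed here, after the two common fields.
       The names NEW_JOB_FIELDS and parse_new_job and the sample line are
       hypothetical.

```python
# Field names in the order they are described above (assumed to match
# the order in the file).
NEW_JOB_FIELDS = (
    "submission_time", "job_number", "task_number", "pe_taskid",
    "job_name", "owner", "group", "project", "department",
    "account", "priority",
)


def parse_new_job(line):
    """Return the new_job specific fields of one record as a dict of strings."""
    parts = line.rstrip("\n").split(":")
    if len(parts) < 2 or parts[1] != "new_job":
        raise ValueError("not a new_job record")
    values = parts[2:]
    if len(values) != len(NEW_JOB_FIELDS):
        raise ValueError("unexpected number of fields")
    return dict(zip(NEW_JOB_FIELDS, values))


# Hypothetical sample record, made up for illustration only.
job = parse_new_job(
    "1184832000:new_job:1184832000:42:-1:none:sleeper:alice:staff:"
    "NONE:defaultdepartment:sge:0"
)
```

       All values are kept as strings; callers convert the ones they need,
       for example int(job["job_number"]).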

   job_log
       The job_log record is written whenever a job, an array task or a pe
       task changes status.  A status change can be the transition from
       pending to running, but can also be triggered by user actions like
       suspension of a job.  It has the following fields:

       event_time
              Time (GMT Unix time stamp) when the event was generated.

       event  A one word description of the event.

       job_number
              The job number.

       task_number
              The array task id.  It has the value -1 for new_job records (as
              we don't have array tasks yet).

       pe_taskid
              The task id of parallel tasks.  It has the value "none" for
              new_job records.

       state  The state of the job after the event was processed.

       user   The user who initiated the event (or the special usernames
              "qmaster", "scheduler" and "execd" for actions of the system
              itself like scheduling jobs, executing jobs etc.).

       host   The host from which the action was initiated (e.g. the submit
              host, the qmaster host, etc.).

       state_time
              Reserved field for later use.

       submission_time
              Time (GMT Unix time stamp) when the job was submitted.

       job_name
              The job name (from -N submission option).

       owner  The job owner.

       group  The Unix group of the job owner.

       project
              The project the job is running in.

       department
              The department the job owner is in.

       account
              The account string specified for the job (from -A submission
              option).

       priority
              The job priority (from -p submission option).

       message
              A message describing the reported action.
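
       Because the message is free text, a reader of job_log records may
       want to limit the number of splits so that the message stays in one
       piece.  The sketch below assumes that the fields appear in the
       order listed above, that the message is the last field and that it
       may contain the delimiter; none of this is confirmed by the text
       above.  JOB_LOG_FIELDS, parse_job_log, the event name and the state
       value in the sample line are hypothetical.

```python
# Field names in the order they are described above (assumed order).
JOB_LOG_FIELDS = (
    "event_time", "event", "job_number", "task_number", "pe_taskid",
    "state", "user", "host", "state_time", "submission_time",
    "job_name", "owner", "group", "project", "department",
    "account", "priority", "message",
)


def parse_job_log(line):
    """Return the job_log specific fields of one record as a dict of strings."""
    # Two common fields plus 17 fixed fields are split off; whatever is
    # left over, including any further ':' characters, is the message.
    parts = line.rstrip("\n").split(":", len(JOB_LOG_FIELDS) + 1)
    if len(parts) != len(JOB_LOG_FIELDS) + 2 or parts[1] != "job_log":
        raise ValueError("not a complete job_log record")
    return dict(zip(JOB_LOG_FIELDS, parts[2:]))


# Hypothetical sample record, made up for illustration only.
entry = parse_job_log(
    "1184832005:job_log:1184832005:pending:42:-1:none:r:qmaster:master01:"
    "0:1184832000:sleeper:alice:staff:NONE:defaultdepartment:sge:0:"
    "new job: accepted"
)
```

       With a plain split(":") the sample message would be cut in two; the
       maxsplit argument keeps it intact.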

   acct
       Records of type acct are accounting records.  They are written whenever
       a job, a task of an array job or the task of a parallel job terminates.
       Accounting records comprise the following fields:

       qname  Name of the cluster queue in which the job has run.

       hostname
              Name of the execution host.

       group  The effective group id of the job owner when executing the job.

       owner  Owner of the Grid Engine job.

       job_name
              Job name.

       job_number
              Job identifier - job number.

       account
              An account string as specified by the qsub(1) or qalter(1) -A
              option.

       priority
              Priority value assigned to the job corresponding to the priority
              parameter in the queue configuration (see queue_conf(5)).

       submission_time
              Submission time (GMT Unix time stamp).  For slave tasks of
              tightly integrated parallel jobs, the submission_time is set to
              0.

       start_time
              Start time (GMT Unix time stamp).

       end_time
              End time (GMT Unix time stamp).

       failed Indicates the problem which occurred in case a job could not be
              started on the execution host (e.g. because the owner of the job
              did not have a valid account on that machine).  If Grid Engine
              tries to start a job multiple times, this may lead to multiple
              entries in the accounting file corresponding to the same job ID.

       exit_status
              Exit status of the job script (or Grid Engine specific status in
              case of certain error conditions).

       ru_wallclock
              Difference between end_time and start_time (see above).

       The remainder of the accounting entries follows the contents of the
       standard UNIX rusage structure as described in getrusage(2).  Depending
       on the operating system where the job was executed, some of the fields
       may be 0.  The following entries are provided:

              ru_utime
              ru_stime
              ru_maxrss
              ru_ixrss
              ru_ismrss
              ru_idrss
              ru_isrss
              ru_minflt
              ru_majflt
              ru_nswap
              ru_inblock
              ru_oublock
              ru_msgsnd
              ru_msgrcv
              ru_nsignals
              ru_nvcsw
              ru_nivcsw

       project
              The project which was assigned to the job.

       department
              The department which was assigned to the job.

       granted_pe
              The parallel environment which was selected for that job.

       slots  The number of slots which were dispatched to the job by the
              scheduler.

       task_number
              Array job task index number.

       cpu    The cpu time usage in seconds.

       mem    The integral memory usage in Gbyte seconds.

       io     The amount of data transferred in input/output operations.

       category
              A string specifying the job category.

       iow    The io wait time in seconds.

       pe_taskid
              If this identifier is set, the task was part of a parallel job
              and was passed to Grid Engine via the qrsh -inherit interface.

       maxvmem
              The maximum vmem size in bytes.
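
       The acct record is long, so a name table helps when reading it.
       The following sketch assumes that the fields appear in the file in
       the order in which they are described in this section and that no
       field (for example category) contains the delimiter.  ACCT_FIELDS,
       parse_acct and wallclock_seconds are hypothetical names, and the
       sample line is synthetic.

```python
# Field names in the order they are described above (assumed order).
ACCT_FIELDS = (
    "qname hostname group owner job_name job_number account priority "
    "submission_time start_time end_time failed exit_status ru_wallclock "
    "ru_utime ru_stime ru_maxrss ru_ixrss ru_ismrss ru_idrss ru_isrss "
    "ru_minflt ru_majflt ru_nswap ru_inblock ru_oublock ru_msgsnd "
    "ru_msgrcv ru_nsignals ru_nvcsw ru_nivcsw project department "
    "granted_pe slots task_number cpu mem io category iow pe_taskid maxvmem"
).split()


def parse_acct(line):
    """Return the acct specific fields of one record as a dict of strings."""
    parts = line.rstrip("\n").split(":")
    if len(parts) != len(ACCT_FIELDS) + 2 or parts[1] != "acct":
        raise ValueError("not a complete acct record")
    return dict(zip(ACCT_FIELDS, parts[2:]))


def wallclock_seconds(record):
    """Recompute ru_wallclock as end_time minus start_time."""
    return int(record["end_time"]) - int(record["start_time"])


# Build a synthetic sample line: every field is "0" except a few.
sample_values = dict.fromkeys(ACCT_FIELDS, "0")
sample_values.update(qname="all.q", hostname="node01",
                     start_time="1184832010", end_time="1184832070",
                     ru_wallclock="60")
sample_line = "1184832070:acct:" + ":".join(
    sample_values[name] for name in ACCT_FIELDS
)
record = parse_acct(sample_line)
```

       Comparing wallclock_seconds(record) with record["ru_wallclock"] is
       one simple consistency check on parsed records.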

   queue
       Records of type queue contain state information for queues (queue
       instances).  A queue record has the following fields:

       qname  The cluster queue name.

       hostname
              The hostname of a specific queue instance.

       report_time
              The time (GMT Unix time stamp) when a state change was
              triggered.

       state  The new queue state.

   queue_consumable
       A queue_consumable record contains information about queue consumable
       values in addition to queue state information:

       qname  The cluster queue name.

       hostname
              The hostname of a specific queue instance.

       report_time
              The time (GMT Unix time stamp) when a state change was
              triggered.

       state  The new queue state.

       consumables
              Description of consumable values.  Information about multiple
              consumables is separated by space.  A consumable description has
              the format <name>=<actual_value>=<configured value>.
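
       The consumables field can be decoded with two nested splits.  The
       sketch below assumes that both values are numeric, which the text
       above does not state; parse_consumables and the sample string are
       hypothetical.

```python
def parse_consumables(field):
    """Map each consumable name to an (actual, configured) pair of floats."""
    result = {}
    for item in field.split():
        # Each item has the form <name>=<actual_value>=<configured value>.
        name, actual, configured = item.split("=")
        result[name] = (float(actual), float(configured))
    return result


# Hypothetical sample field, made up for illustration only.
consumables = parse_consumables(
    "slots=2.000000=4.000000 licenses=1.000000=10.000000"
)
```

       The same helper applies to the consumables field of host_consumable
       records, which uses the same format.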

   host
       A host record contains information about hosts and host load values.
       It contains the following information:

       hostname
              The name of the host.

       report_time
              The time (GMT Unix time stamp) when the reported information was
              generated.

       state  The new host state.  Currently, Grid Engine doesn't track a host
              state; the field is reserved for future use and contains the
              value X.

       load values
              Description of load values.  Information about multiple load
              values is separated by space.  A load value description has the
              format <name>=<actual_value>.
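
       Load values can be read in a similar way.  Since a load value is
       not necessarily numeric, this sketch keeps non-numeric values as
       strings.  parse_load_values is a hypothetical name and the sample
       string is illustrative only.

```python
def parse_load_values(field):
    """Map each load value name to a float, or to a string if not numeric."""
    values = {}
    for item in field.split():
        # Each item has the form <name>=<actual_value>.
        name, _, raw = item.partition("=")
        try:
            values[name] = float(raw)
        except ValueError:
            values[name] = raw
    return values


# Illustrative sample field, made up for this example.
load = parse_load_values("load_avg=0.25 num_proc=4 arch=lx24-amd64")
```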

   host_consumable
       A host_consumable record contains information about hosts and host
       consumables.  Host consumables can, for example, be licenses.  It
       contains the following information:

       hostname
              The name of the host.

       report_time
              The time (GMT Unix time stamp) when the reported information was
              generated.

       state  The new host state.  Currently, Grid Engine doesn't track a host
              state; the field is reserved for future use and contains the
              value X.

       consumables
              Description of consumable values.  Information about multiple
              consumables is separated by space.  A consumable description has
              the format <name>=<actual_value>=<configured value>.

   sharelog
       The Grid Engine qmaster can dump information about sharetree
       configuration and use to the reporting file.  The parameter sharelog
       sets the interval at which sharetree information will be dumped.  It
       is set in the format HH:MM:SS.  A value of 00:00:00 configures qmaster
       not to dump sharetree information.  Intervals of several minutes up to
       hours are sensible values for this parameter.  The record contains the
       following fields:

       current time
              The present time.

       usage time
              The time used so far.

       node name
              The node name.

       user name
              The user name.

       project name
              The project name.

       shares The total shares.

       job count
              The job count.

       level  The percentage of shares used.

       total  The adjusted percentage of shares used.

       long target share
              The long target percentage of resource shares used.

       short target share
              The short target percentage of resource shares used.

       actual share
              The actual percentage of resource shares used.

       usage  The combined shares used.

       cpu    The cpu used.

       mem    The memory used.

       io     The IO used.

       long target cpu
              The long target cpu used.

       long target mem
              The long target memory used.

       long target io
              The long target IO used.
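
       The HH:MM:SS value of the sharelog parameter described at the start
       of this section converts to seconds as sketched below.  The function
       names are hypothetical; the only behaviour taken from the text is
       that 00:00:00 turns dumping off.

```python
def sharelog_interval_seconds(value):
    """Convert a sharelog interval in HH:MM:SS format to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def sharelog_enabled(value):
    """A value of 00:00:00 means that no sharetree information is dumped."""
    return sharelog_interval_seconds(value) > 0
```

       For example, sharelog_interval_seconds("00:10:00") evaluates to 600.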

   new_ar
       A new_ar record contains information about advance reservation objects.
       Entries of this type will be added if an advance reservation is
       created.  It contains the following information:

       submission_time
              The time (GMT Unix time stamp) when the advance reservation was
              created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_owner
              The owner of the advance reservation.

   ar_attribute
       The ar_attribute record is written whenever a new advance reservation
       is added or an attribute of an existing advance reservation has
       changed.  It has the following fields:

       event_time
              The time (GMT Unix time stamp) when the event was generated.

       ar_number
              The advance reservation number identifying the reservation.

       ar_name
              Name of the advance reservation.

       ar_account
              An account string which was specified during the creation of the
              advance reservation.

       ar_start_time
              Start time.

       ar_end_time
              End time.

       ar_granted_pe
              The parallel environment which was selected for an advance
              reservation.

       ar_granted_resources
              The granted resources which were selected for an advance
              reservation.

   ar_log
       The ar_log record is written whenever an advance reservation changes
       status.  A status change can be from pending to active, but can also be
       triggered by system events like a host outage.  It has the following
       fields:

       ar_state_change_time
              The time (GMT Unix time stamp) when the event occurred which
              caused a state change.

       ar_number
              The advance reservation number identifying the reservation.

       ar_state
              The new state.

       ar_event
              An event id identifying the event which caused the state change.

       ar_message
              A message describing the event which caused the state change.

   ar_acct
       The ar_acct records are accounting records which are written for every
       queue instance whenever an advance reservation terminates.  Advance
       reservation accounting records comprise the following fields:

       ar_termination_time
              The time (GMT Unix time stamp) when the advance reservation
              terminated.

       ar_number
              The advance reservation number identifying the reservation.

       ar_qname
              Cluster queue name which the advance reservation reserved.

       ar_hostname
              The name of the execution host.

       ar_slots
              The number of slots which were reserved.
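
       Because one ar_acct record is written per queue instance, the total
       size of a reservation has to be summed over several records.  The
       sketch below assumes that the five fields follow the two common
       fields in the order listed above; reserved_slots and the sample
       lines are hypothetical.

```python
from collections import defaultdict


def reserved_slots(lines):
    """Sum ar_slots per ar_number over all ar_acct records in lines."""
    totals = defaultdict(int)
    for line in lines:
        parts = line.rstrip("\n").split(":")
        # time, record type, ar_termination_time, ar_number, ar_qname,
        # ar_hostname, ar_slots
        if len(parts) == 7 and parts[1] == "ar_acct":
            totals[parts[3]] += int(parts[6])
    return dict(totals)


# Hypothetical sample records, made up for illustration only.
totals = reserved_slots([
    "1184835600:ar_acct:1184835600:7:all.q:node01:4",
    "1184835600:ar_acct:1184835600:7:all.q:node02:2",
    "1184835600:queue:all.q:node01:1184835600:s",
])
```

       Records of other types, such as the queue record in the sample, are
       skipped.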

SEE ALSO

       sge_conf(5), host_conf(5).

COPYRIGHT

       See sge_intro(1) for a full statement of rights and permissions.

GE 6.1                   $Date: 2007/07/19 08:17:18 $             REPORTING(5)