REPORTING(5)               Grid Engine File Formats               REPORTING(5)

NAME

       reporting - Grid Engine reporting file format

DESCRIPTION

       A Grid Engine system writes a reporting file to
       $SGE_ROOT/default/common/reporting. The reporting file contains data
       that can be used for accounting, monitoring and analysis purposes. It
       contains information about the cluster (hosts, queues, load values,
       consumables, etc.), about the jobs running in the cluster and about
       sharetree configuration and usage. All information is time related:
       events are dumped to the reporting file at a configurable interval.
       This allows monitoring the "real time" status of the cluster as well
       as historical analysis.

FORMAT

       The reporting file is an ASCII file. Each line contains one record,
       and the fields of a record are separated by a delimiter (:). The
       reporting file contains records of different types. Each record type
       has a specific record structure.

       The first two fields are common to all reporting records:

       time   Time (GMT unix time stamp) when the record was created.

       record type
              Type of the reporting record. The different types of records
              and their structure are described in the following text.

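       As an illustration of the common record layout, the following Python
       sketch splits one line into the two common fields and the remaining
       type-specific fields. The names Record and parse_record and the
       sample line are made up for this example, and the sketch assumes
       that no field value itself contains the delimiter.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    time: int          # creation time of the record (unix time stamp)
    record_type: str   # for example "new_job", "acct" or "host"
    fields: List[str]  # remaining type-specific fields, kept as strings


def parse_record(line: str, delimiter: str = ":") -> Record:
    """Split one reporting line into the common fields and the rest."""
    parts = line.rstrip("\n").split(delimiter)
    if len(parts) < 2:
        raise ValueError(f"not a reporting record: {line!r}")
    return Record(int(parts[0]), parts[1], parts[2:])


# Hypothetical sample line, not taken from a real cluster.
rec = parse_record("1208879342:queue:all.q:node01:1208879342:u\n")
print(rec.record_type, rec.fields)
```

       A reader of a real reporting file would typically dispatch on
       record_type and interpret the remaining fields as described in the
       sections below.
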
   new_job
       The new_job record is written whenever a new job enters the system
       (usually by a submitting command). It has the following fields:

       submission_time
              Time (GMT unix time stamp) when the job was submitted.

       job_number
              The job number.

       task_number
              The array task id. Always has the value -1 for new_job records
              (as we don't have array tasks yet).

       pe_taskid
              The task id of parallel tasks. Always has the value "none" for
              new_job records.

       job_name
              The job name (from -N submission option).

       owner  The job owner.

       group  The unix group of the job owner.

       project
              The project the job is running in.

       department
              The department the job owner is in.

       account
              The account string specified for the job (from -A submission
              option).

       priority
              The job priority (from -p submission option).

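       The field order listed above can be turned into a small name-to-value
       mapping. This is a sketch only: NEW_JOB_FIELDS, parse_new_job and the
       sample line are hypothetical, and the code again assumes that no
       field value (for example the job name) contains a colon.

```python
# Field names in the order documented for new_job records.
NEW_JOB_FIELDS = (
    "submission_time", "job_number", "task_number", "pe_taskid", "job_name",
    "owner", "group", "project", "department", "account", "priority",
)


def parse_new_job(line):
    """Return the type-specific fields of a new_job record as a dict."""
    parts = line.rstrip("\n").split(":")
    if len(parts) < 2 or parts[1] != "new_job":
        raise ValueError("not a new_job record")
    values = parts[2:]  # skip the common fields: time and record type
    if len(values) != len(NEW_JOB_FIELDS):
        raise ValueError(f"expected {len(NEW_JOB_FIELDS)} fields, "
                         f"got {len(values)}")
    return dict(zip(NEW_JOB_FIELDS, values))


# Hypothetical sample line, for illustration only.
sample = ("1208879342:new_job:1208879340:42:-1:none:sleeper:alice:staff:"
          "proj1:defaultdepartment:sge:0")
job = parse_new_job(sample)
print(job["job_number"], job["owner"], job["task_number"])
```
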
   job_log
       The job_log record is written whenever a job, an array task or a PE
       task changes status. A status change can be the transition from
       pending to running, but can also be triggered by user actions like
       suspension of a job. It has the following fields:

       event_time
              Time (GMT unix time stamp) when the event was generated.

       event  A one word description of the event.

       job_number
              The job number.

       task_number
              The array task id. Always has the value -1 for new_job records
              (as we don't have array tasks yet).

       pe_taskid
              The task id of parallel tasks. Always has the value "none" for
              new_job records.

       state  The state of the job after the event was processed.

       user   The user who initiated the event (or the special usernames
              "qmaster", "scheduler" and "execd" for actions of the system
              itself, like scheduling jobs, executing jobs etc.).

       host   The host from which the action was initiated (e.g. the submit
              host, the qmaster host, etc.).

       state_time
              Reserved field for later use.

       submission_time
              Time (GMT unix time stamp) when the job was submitted.

       job_name
              The job name (from -N submission option).

       owner  The job owner.

       group  The unix group of the job owner.

       project
              The project the job is running in.

       department
              The department the job owner is in.

       account
              The account string specified for the job (from -A submission
              option).

       priority
              The job priority (from -p submission option).

       message
              A message describing the reported action.

   acct
       Records of type acct are accounting records. Normally, they are
       written whenever a job, a task of an array job, or the task of a
       parallel job terminates. However, for long running jobs an
       intermediate acct record is created once a day after midnight. This
       results in multiple accounting records for a particular job and
       allows for fine-grained resource usage monitoring over time.
       Accounting records comprise the following fields:

       qname  Name of the cluster queue in which the job has run.

       hostname
              Name of the execution host.

       group  The effective group id of the job owner when executing the job.

       owner  Owner of the Grid Engine job.

       job_name
              Job name.

       job_number
              Job identifier - job number.

       account
              An account string as specified by the qsub(1) or qalter(1) -A
              option.

       priority
              Priority value assigned to the job corresponding to the priority
              parameter in the queue configuration (see queue_conf(5)).

       submission_time
              Submission time (GMT unix time stamp).

       start_time
              Start time (GMT unix time stamp).

       end_time
              End time (GMT unix time stamp).

       failed Indicates the problem which occurred in case a job could not be
              started on the execution host (e.g. because the owner of the job
              did not have a valid account on that machine). If Grid Engine
              tries to start a job multiple times, this may lead to multiple
              entries in the accounting file corresponding to the same job ID.

       exit_status
              Exit status of the job script (or Grid Engine specific status in
              case of certain error conditions).

       ru_wallclock
              Difference between end_time and start_time (see above).

       The remainder of the accounting entries follows the contents of the
       standard UNIX rusage structure as described in getrusage(2). Depending
       on the operating system where the job was executed, some of the fields
       may be 0. The following entries are provided:

              ru_utime
              ru_stime
              ru_maxrss
              ru_ixrss
              ru_ismrss
              ru_idrss
              ru_isrss
              ru_minflt
              ru_majflt
              ru_nswap
              ru_inblock
              ru_oublock
              ru_msgsnd
              ru_msgrcv
              ru_nsignals
              ru_nvcsw
              ru_nivcsw

       project
              The project which was assigned to the job.

       department
              The department which was assigned to the job.

       granted_pe
              The parallel environment which was selected for that job.

       slots  The number of slots which were dispatched to the job by the
              scheduler.

       task_number
              Array job task index number.

       cpu    The cpu time usage in seconds.

       mem    The integral memory usage in Gbytes seconds.

       io     The amount of data transferred in input/output operations.

       category
              A string specifying the job category.

       iow    The io wait time in seconds.

       pe_taskid
              If this identifier is set, the task was part of a parallel job
              and was passed to Grid Engine via the qrsh -inherit interface.

       maxvmem
              The maximum vmem size in bytes.

       arid   Advance reservation identifier. If the job used resources of an
              advance reservation, then this field contains a positive integer
              identifier; otherwise the value is "0".

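       As a sketch of how the acct field list might be used, the following
       Python code names the fields in the order given above and sums the
       cpu field per owner. ACCT_FIELDS, parse_acct, cpu_by_owner and
       make_sample are hypothetical names, the sample records are
       synthetic, and the code assumes that no field (for example category)
       contains a colon. Whether intermediate records report cumulative or
       incremental usage is not stated here, so the sums are illustrative
       only.

```python
from collections import defaultdict

# Field names in the order documented for acct records.
ACCT_FIELDS = (
    "qname", "hostname", "group", "owner", "job_name", "job_number",
    "account", "priority", "submission_time", "start_time", "end_time",
    "failed", "exit_status", "ru_wallclock",
    "ru_utime", "ru_stime", "ru_maxrss", "ru_ixrss", "ru_ismrss",
    "ru_idrss", "ru_isrss", "ru_minflt", "ru_majflt", "ru_nswap",
    "ru_inblock", "ru_oublock", "ru_msgsnd", "ru_msgrcv", "ru_nsignals",
    "ru_nvcsw", "ru_nivcsw",
    "project", "department", "granted_pe", "slots", "task_number",
    "cpu", "mem", "io", "category", "iow", "pe_taskid", "maxvmem", "arid",
)


def parse_acct(line):
    """Return the type-specific fields of an acct record as a dict."""
    parts = line.rstrip("\n").split(":")
    if len(parts) < 2 or parts[1] != "acct":
        raise ValueError("not an acct record")
    values = parts[2:]  # skip the common fields: time and record type
    if len(values) != len(ACCT_FIELDS):
        raise ValueError(f"expected {len(ACCT_FIELDS)} fields, "
                         f"got {len(values)}")
    return dict(zip(ACCT_FIELDS, values))


def cpu_by_owner(records):
    """Sum the cpu field (seconds) of parsed acct records per owner."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["owner"]] += float(rec["cpu"])
    return dict(totals)


def make_sample(owner, cpu):
    """Build a synthetic acct line; every other field is set to "0"."""
    values = {name: "0" for name in ACCT_FIELDS}
    values.update(owner=owner, cpu=cpu)
    return ":".join(["1208879342", "acct"] + [values[n] for n in ACCT_FIELDS])


lines = [make_sample("alice", "1.5"), make_sample("bob", "2.0"),
         make_sample("alice", "0.5")]
totals = cpu_by_owner(parse_acct(line) for line in lines)
print(totals)
```
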
   queue
       Records of type queue contain state information for queues (queue
       instances). A queue record has the following fields:

       qname  The cluster queue name.

       hostname
              The hostname of a specific queue instance.

       report_time
              The time (GMT unix time stamp) when a state change was
              triggered.

       state  The new queue state.

   queue_consumable
       A queue_consumable record contains information about queue consumable
       values in addition to queue state information:

       qname  The cluster queue name.

       hostname
              The hostname of a specific queue instance.

       report_time
              The time (GMT unix time stamp) when a state change was
              triggered.

       state  The new queue state.

       consumables
              Description of consumable values. Information about multiple
              consumables is separated by space. A consumable description has
              the format <name>=<actual_value>=<configured value>.

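       The consumables field can be decoded along the lines of the following
       sketch. The function name parse_consumables and the sample values
       are hypothetical, and the code assumes that both values of each
       consumable are plain numbers.

```python
def parse_consumables(field):
    """Map each consumable name to an (actual, configured) pair."""
    result = {}
    for item in field.split():  # consumables are separated by space
        # Split from the right so that only the last two "=" are used.
        name, actual, configured = item.rsplit("=", 2)
        result[name] = (float(actual), float(configured))
    return result


# Hypothetical field content, for illustration only.
usage = parse_consumables("slots=2.000000=8.000000 license=1.000000=4.000000")
print(usage)
```

       The same function would apply to the consumables field of
       host_consumable records, which uses the same format.
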
   host
       A host record contains information about hosts and host load values.
       It contains the following information:

       hostname
              The name of the host.

       report_time
              The time (GMT unix time stamp) when the reported information was
              generated.

       state  The new host state. Currently, Grid Engine doesn't track a host
              state; the field is reserved for future use. Always contains the
              value X.

       load values
              Description of load values. Information about multiple load
              values is separated by space. A load value description has the
              format <name>=<actual_value>.

   host_consumable
       A host_consumable record contains information about hosts and host
       consumables. Host consumables can for example be licenses. It contains
       the following information:

       hostname
              The name of the host.

       report_time
              The time (GMT unix time stamp) when the reported information was
              generated.

       state  The new host state. Currently, Grid Engine doesn't track a host
              state; the field is reserved for future use. Always contains the
              value X.

       consumables
              Description of consumable values. Information about multiple
              consumables is separated by space. A consumable description has
              the format <name>=<actual_value>=<configured value>.

   sharelog
       The Grid Engine qmaster can dump information about sharetree
       configuration and use to the reporting file. The parameter sharelog
       sets the interval at which sharetree information is dumped. It is set
       in the format HH:MM:SS. A value of 00:00:00 configures qmaster not to
       dump sharetree information. Intervals of several minutes up to hours
       are sensible values for this parameter. The record contains the
       following fields:

       current time
              The present time.

       usage time
              The time used so far.

       node name
              The node name.

       user name
              The user name.

       project name
              The project name.

       shares The total shares.

       job count
              The job count.

       level  The percentage of shares used.

       total  The adjusted percentage of shares used.

       long target share
              The long target percentage of resource shares used.

       short target share
              The short target percentage of resource shares used.

       actual share
              The actual percentage of resource shares used.

       usage  The combined shares used.

       cpu    The cpu used.

       mem    The memory used.

       io     The IO used.

       long target cpu
              The long target cpu used.

       long target mem
              The long target memory used.

       long target io
              The long target IO used.

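       The HH:MM:SS interval format of the sharelog parameter can be
       converted to seconds as in the following sketch. The function name
       sharelog_interval_seconds is made up for this example, and the range
       checks on minutes and seconds are an assumption, not documented
       behaviour of qmaster.

```python
def sharelog_interval_seconds(value):
    """Convert an HH:MM:SS interval to seconds; 0 means "do not dump"."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    if hours < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError(f"invalid interval: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


print(sharelog_interval_seconds("00:10:00"))  # ten minutes
print(sharelog_interval_seconds("00:00:00"))  # dumping disabled
```
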
   new_ar
       A new_ar record contains information about advance reservation objects.
       Entries of this type will be added if an advance reservation is
       created. It contains the following information:

       submission_time
              The time (GMT unix time stamp) when the advance reservation was
              created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_owner
              The owner of the advance reservation.

   ar_attribute
       The ar_attribute record is written whenever a new advance reservation
       is added or an attribute of an existing advance reservation has
       changed. It has the following fields:

       event_time
              The time (GMT unix time stamp) when the event was generated.

       submission_time
              The time (GMT unix time stamp) when the advance reservation was
              created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_name
              Name of the advance reservation.

       ar_account
              An account string which was specified during the creation of the
              advance reservation.

       ar_start_time
              Start time.

       ar_end_time
              End time.

       ar_granted_pe
              The parallel environment which was selected for an advance
              reservation.

       ar_granted_resources
              The granted resources which were selected for an advance
              reservation.

   ar_log
       The ar_log record is written whenever an advance reservation changes
       status. A status change can be from pending to active, but can also be
       triggered by system events like host outage. It has the following
       fields:

       ar_state_change_time
              The time (GMT unix time stamp) when the event occurred which
              caused a state change.

       submission_time
              The time (GMT unix time stamp) when the advance reservation was
              created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_state
              The new state.

       ar_event
              An event id identifying the event which caused the state change.

       ar_message
              A message describing the event which caused the state change.

   ar_acct
       The ar_acct records are accounting records which are written for every
       queue instance whenever an advance reservation terminates. Advance
       reservation accounting records comprise the following fields:

       ar_termination_time
              The time (GMT unix time stamp) when the advance reservation
              terminated.

       submission_time
              The time (GMT unix time stamp) when the advance reservation was
              created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_qname
              Cluster queue name which the advance reservation reserved.

       ar_hostname
              The name of the execution host.

       ar_slots
              The number of slots which were reserved.

SEE ALSO

       sge_conf(5), host_conf(5).

COPYRIGHT

       See ge_intro(1) for a full statement of rights and permissions.

GE 6.2u5                 $Date: 2008/04/22 15:49:02 $             REPORTING(5)