QPING(1)                   Grid Engine User Commands                  QPING(1)

NAME

       qping - check application status of Grid Engine daemons.

SYNTAX

       qping [-help] [-noalias] [-ssl|-tcp] [ [ [-i <interval>] [-info] [-f] ]
       | [ [-dump_tag tag [param] ] [-dump] [-nonewline]  ]  ]  <host>  <port>
       <name> <id>

DESCRIPTION

       qping validates the runtime status of a Grid Engine service daemon.
       The current Grid Engine implementation allows querying the sge_qmaster
       daemon and any running sge_execd daemon. qping sends a SIM (Status
       Information Message) to the destination daemon. The communication
       layer of that daemon responds with a SIRM (Status Information Response
       Message), which contains status information about the queried daemon.

       The -dump and -dump_tag options allow an administrator to observe the
       communication protocol data flow of a Grid Engine service daemon.
       qping -dump must be run as root on the same host where the observed
       daemon is running.

OPTIONS

   -f
       Show full status information on each ping interval.

       First output line: The date and time of the request.

       SIRM version: Internal version number of the SIRM (Status Information
       Response Message).

       SIRM message id: Current message id for this connection.

       start time: Start time of the daemon, in the following format:

       MM/DD/YYYY HH:MM:SS (seconds since 01.01.1970)

       run time [s]: Run time in seconds since the start time.

       messages in read buffer: Number of messages buffered in the
       communication buffer for the application (daemon). If this number
       grows too large, the daemon is not able to handle all messages sent
       to it.

       messages in write buffer: Number of messages buffered in the
       communication write buffer. These messages were sent by the
       application (daemon) to the connected clients, but the communication
       layer has not yet been able to send them. If this number grows too
       large, the communication layer cannot send messages as fast as the
       application (daemon) wants them to be sent.

       nr. of connected clients: Number of clients currently connected to
       this daemon. This count includes the current qping connection.

       status: The status value of the daemon. This value depends on the
       application that replies to the qping request. If the application
       does not provide any information, the status is 99999. The possible
       status values for the Grid Engine daemons are:

       qmaster:

       0 There is no unusual timing situation.

       1 One or more threads have reached the warning timeout. This may
       happen when at least one thread has not updated its time stamp for an
       unusually long time. A possible reason is a high workload for this
       thread.

       2 One or more threads have reached the error timeout. This may happen
       when at least one thread has not updated its time stamp for longer
       than 10 minutes.

       3 The time measurement is not initialized.

       execd:

       0 There is no unusual timing situation.

       1 The dispatcher has reached the warning timeout. This may happen when
       the dispatcher has not updated its time stamp for an unusually long
       time. A possible reason is a high workload.

       2 The dispatcher has reached the error timeout. This may happen when
       the dispatcher has not updated its time stamp for longer than 10
       minutes.

       3 The time measurement is not initialized.
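
       The status values listed above can be turned into readable text in a
       monitoring script. The following is a minimal sketch, not part of
       qping itself: the table layout and the names STATUS_MEANINGS,
       NO_STATUS and describe_status are illustrative assumptions, and the
       descriptions are shortened from the text above.

```python
# Lookup table built from the status values documented for -f.
# The names used here are hypothetical helpers, not part of Grid Engine.
STATUS_MEANINGS = {
    "qmaster": {
        0: "no unusual timing situation",
        1: "one or more threads reached the warning timeout",
        2: "one or more threads reached the error timeout",
        3: "time measurement not initialized",
    },
    "execd": {
        0: "no unusual timing situation",
        1: "dispatcher reached the warning timeout",
        2: "dispatcher reached the error timeout",
        3: "time measurement not initialized",
    },
}

# Value reported when the application provides no status information.
NO_STATUS = 99999


def describe_status(daemon: str, status: int) -> str:
    """Return a short description for a status value of 'qmaster' or 'execd'."""
    if status == NO_STATUS:
        return "application did not provide status information"
    return STATUS_MEANINGS[daemon].get(status, f"unknown status {status}")


if __name__ == "__main__":
    print(describe_status("qmaster", 1))
```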


       info: Status message of the daemon. This value depends on the
       application that replies to the qping request. If the application
       does not provide any information, the info message is "not
       available". The possible info messages for the Grid Engine daemons
       are:


       qmaster:

       The info message contains information about the qmaster threads, each
       followed by a thread state and time information. Each time one of the
       known threads passes through its main loop, the time information is
       updated. Since the qmaster has two message threads, every message
       thread updates the time. This means the timeout for the message
       thread (MT) can only occur when no message thread is active anymore:

       THREAD_NAME: THREAD_STATE (THREAD_TIME)

       THREAD_NAME:
       EDT:  Event Delivery Thread
       TET:  Timed Event Thread
       MT:   Message Thread(s)
       SIGT: SIGnal Thread

       In addition to these thread names, the name can contain a thread
       number (for example: MT(1)) when multiple instances of this thread
       are running.

       THREAD_STATE:
       R: Running
       W: Warning
       E: Error

       THREAD_TIME:
       Time since the last time stamp update.

       After the thread information follows an additional information string
       which describes the complete application status.
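
       The entry format THREAD_NAME: THREAD_STATE (THREAD_TIME) described
       above could be parsed roughly as follows. This is a sketch under
       assumptions: the sample string is invented for illustration, the
       exact spacing and the notation of the time value in real qping output
       are not specified here, and the time is therefore kept as raw text.

```python
import re
from typing import List, NamedTuple, Optional


class ThreadInfo(NamedTuple):
    name: str               # e.g. EDT, TET, MT, SIGT
    number: Optional[int]   # thread number, e.g. 1 for MT(1); None if absent
    state: str              # Running, Warning or Error
    time: str               # raw text found between the parentheses


# NAME or NAME(n), a colon, a one-letter state, then "(time)".
ENTRY = re.compile(r"(\w+)(?:\((\d+)\))?:\s*([RWE])\s*\(([^)]*)\)")
STATE_NAMES = {"R": "Running", "W": "Warning", "E": "Error"}


def parse_thread_info(info: str) -> List[ThreadInfo]:
    """Extract every 'NAME: STATE (TIME)' entry found in an info message."""
    entries = []
    for match in ENTRY.finditer(info):
        name, number, state, time = match.groups()
        entries.append(
            ThreadInfo(name, int(number) if number else None,
                       STATE_NAMES[state], time)
        )
    return entries


if __name__ == "__main__":
    # Hypothetical info string, made up to match the documented format.
    for entry in parse_thread_info("EDT: R (0.25) MT(1): W (65.00)"):
        print(entry)
```

       Because the pattern only requires a name, a state letter and a
       parenthesized time, the same function would also accept an entry of
       the form dispatcher: STATE (TIME).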

       execd:

       The info message contains information about the execd job dispatcher:

       dispatcher: STATE (TIME)

       STATE:
       R: Running
       W: Warning
       E: Error

       TIME:
       Time since the last time stamp update.

       After the dispatcher information follows an additional information
       string which describes the application status.

       Monitor: If available, displays statistics on a thread. The data for
       each thread is displayed on one line. The format of this line can
       change at any time. Only the master implements monitoring.


   -help
       Prints a list of all options.


   -i interval
       Set the qping interval time.

       The default interval is one second. qping sends a SIM (Status
       Information Message) at each interval.


   -info
       Show full status information (see -f for more information) and exit.
       An exit value of 0 indicates no error. On errors, qping returns 1.
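
       Because -info exits with 0 on success and 1 on error, it lends itself
       to scripted health checks. The sketch below assumes that qping is on
       the PATH; the function names info_command and daemon_is_up are
       hypothetical, and the run parameter exists only so the call can be
       replaced in tests.

```python
import subprocess
from typing import Callable, List


def info_command(host: str, port: int, name: str, endpoint_id: int = 1) -> List[str]:
    """Build the argument list for: qping -info <host> <port> <name> <id>."""
    return ["qping", "-info", host, str(port), name, str(endpoint_id)]


def daemon_is_up(host: str, port: int, name: str, endpoint_id: int = 1,
                 run: Callable = subprocess.run) -> bool:
    """Return True when qping -info exits with 0 (no error)."""
    result = run(info_command(host, port, name, endpoint_id),
                 capture_output=True, text=True)
    return result.returncode == 0


# Example call (requires a reachable daemon and qping on the PATH):
#     daemon_is_up("master_host", 31116, "qmaster")
```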


   -noalias
       Ignore the host_aliases file, which is located at
       $SGE_ROOT/$SGE_CELL/common/host_aliases. If this option is used, it
       is not necessary to set any Grid Engine environment variable.


   -ssl
       Use an SSL (Secure Socket Layer) configuration. qping will use this
       configuration to connect to services running SSL. If the SGE settings
       file is not sourced, you have to use the -noalias option to bypass
       the need for the SGE_ROOT environment variable. The following
       environment variables are used to specify your certificates:
         SSL_CA_CERT_FILE - CA certificate file
         SSL_CERT_FILE    - certificates file
         SSL_KEY_FILE     - key file
         SSL_RAND_FILE    - rand file


   -tcp
       Select TCP/IP as the protocol used to connect to other services.


   -nonewline
       Dump output will not have a line break within a message, and binary
       messages are not unpacked.


   -dump
       This option allows an administrator to observe the communication
       protocol data flow of a Grid Engine service daemon. qping -dump must
       be run as root on the same host where the observed daemon is running.

       The output is written to stdout. The environment variable
       "SGE_QPING_OUTPUT_FORMAT" can be set to hide columns, set a default
       column width, or set a hostname output format. Its value can be any
       combination of the following specifiers, separated by a space
       character:
              "h:X"   -> hide column X
              "s:X"   -> show column X
              "w:X:Y" -> set width of column X to Y
              "hn:X"  -> set hostname output parameter X.
                         X values are "long" or "short"

       Run qping -help to see which columns are available.
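
       A value for SGE_QPING_OUTPUT_FORMAT can be assembled from the
       specifiers above. The helper below is only a sketch: the function
       name output_format is hypothetical, and colA and colB are placeholder
       column names, since the real column names are listed by qping -help
       and not in this page.

```python
from typing import Dict, Iterable, Optional


def output_format(hide: Iterable[str] = (),
                  show: Iterable[str] = (),
                  widths: Optional[Dict[str, int]] = None,
                  hostname: Optional[str] = None) -> str:
    """Join h:, s:, w: and hn: specifiers with single spaces."""
    parts = [f"h:{column}" for column in hide]
    parts += [f"s:{column}" for column in show]
    parts += [f"w:{column}:{width}" for column, width in (widths or {}).items()]
    if hostname is not None:
        if hostname not in ("long", "short"):
            raise ValueError('hostname must be "long" or "short"')
        parts.append(f"hn:{hostname}")
    return " ".join(parts)


if __name__ == "__main__":
    # colA and colB are placeholders, not real qping column names.
    value = output_format(hide=["colA"], widths={"colB": 20}, hostname="short")
    print(f"SGE_QPING_OUTPUT_FORMAT={value!r}")
```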



   -dump_tag tag [param]
       This option has the same meaning as -dump, but can provide more
       information by specifying the debug level and message types qping
       should print:

       -dump_tag ALL <debug level>
              Shows all possible debug messages (APP+MSG) for the debug
              levels ERROR, WARNING, INFO, DEBUG and DPRINTF. The contacted
              service must support this kind of debugging. This option is
              not currently implemented.

       -dump_tag APP <debug level>
              Shows only application debug messages for the debug levels
              ERROR, WARNING, INFO, DEBUG and DPRINTF. The contacted service
              must support this kind of debugging. This option is not
              currently implemented.

       -dump_tag MSG
              Has the same behavior as the -dump option.


   host
       Host where the daemon is running.


   port
       Port to which the daemon is bound (the sge_qmaster or sge_execd port
       number).


   name
       Name of the communication endpoint ("qmaster" or "execd"). A
       communication endpoint is a triplet of hostname/endpoint
       name/endpoint id (e.g. hostA/qmaster/1 or subhost/qstat/4).


   id
       Id of the communication endpoint ("1" for daemons).
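
       The endpoint triplet described under name maps directly onto the
       <host>, <name> and <id> arguments. The following sketch splits such a
       triplet and rebuilds the positional arguments; the Endpoint type and
       both function names are illustrative, and the code assumes that
       neither the host name nor the endpoint name contains a slash.

```python
from typing import List, NamedTuple


class Endpoint(NamedTuple):
    host: str
    name: str
    id: int


def parse_endpoint(text: str) -> Endpoint:
    """Split 'hostname/endpoint name/endpoint id' into its three parts."""
    host, name, ident = text.split("/")
    return Endpoint(host, name, int(ident))


def positional_args(endpoint: Endpoint, port: int) -> List[str]:
    """Return <host> <port> <name> <id> in the order qping expects them."""
    return [endpoint.host, str(port), endpoint.name, str(endpoint.id)]


if __name__ == "__main__":
    endpoint = parse_endpoint("hostA/qmaster/1")
    print(positional_args(endpoint, 31116))
```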

EXAMPLES

       > qping master_host 31116 qmaster 1
       08/24/2004 16:41:15 endpoint master_host/qmaster/1 at port 31116 is up since 365761 seconds
       08/24/2004 16:41:16 endpoint master_host/qmaster/1 at port 31116 is up since 365762 seconds
       08/24/2004 16:41:17 endpoint master_host/qmaster/1 at port 31116 is up since 365763 seconds

       > qping -info master_host 31116 qmaster 1
       08/24/2004 16:42:47:
       SIRM version:             0.1
       SIRM message id:          1
       start time:               08/20/2004 11:05:14 (1092992714)
       run time [s]:             365853
       messages in read buffer:  0
       messages in write buffer: 0
       nr. of connected clients: 4
       status:                   0
       info:                     ok

       > qping -info execd_host 31117 execd 1
       08/24/2004 16:43:45:
       SIRM version:             0.1
       SIRM message id:          1
       start time:               08/20/2004 11:06:13 (1092992773)
       run time [s]:             365852
       messages in read buffer:  0
       messages in write buffer: 0
       nr. of connected clients: 2
       status:                   0
       info:                     ok
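
       Output such as the -info examples above can be read into a
       dictionary for further processing. This is a minimal sketch that
       assumes one "label: value" pair per line, as in the sample; it does
       not handle additional lines such as the Monitor statistics, and the
       names parse_info and start_epoch are hypothetical.

```python
import re
from typing import Dict

# Sample text copied from the qmaster -info example above.
SAMPLE = """\
08/24/2004 16:42:47:
SIRM version:             0.1
SIRM message id:          1
start time:               08/20/2004 11:05:14 (1092992714)
run time [s]:             365853
messages in read buffer:  0
messages in write buffer: 0
nr. of connected clients: 4
status:                   0
info:                     ok
"""


def parse_info(output: str) -> Dict[str, str]:
    """Map each label to its value; the first line is the request time."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    fields = {"request time": lines[0].rstrip(":")}
    for line in lines[1:]:
        # Split at the first colon only, so values may contain colons.
        label, _, value = line.partition(":")
        fields[label.strip()] = value.strip()
    return fields


def start_epoch(fields: Dict[str, str]) -> int:
    """Return the seconds-since-1970 value from the 'start time' field."""
    match = re.search(r"\((\d+)\)", fields["start time"])
    if match is None:
        raise ValueError("start time has no epoch value in parentheses")
    return int(match.group(1))


if __name__ == "__main__":
    info = parse_info(SAMPLE)
    print(info["status"], info["run time [s]"], start_epoch(info))
```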

ENVIRONMENTAL VARIABLES

       SGE_ROOT       Specifies the location of the Grid Engine standard
                      configuration files.

       SGE_CELL       If set, specifies the default Grid Engine cell.

SEE ALSO

       sge_intro(1), SGE_H_ALIASES(5), sge_qmaster(8), sge_execd(8).

COPYRIGHT

       See sge_intro(1) for a full statement of rights and permissions.



GE 6.1                   $Date: 2007/07/19 08:17:15 $                 QPING(1)