QPING(1)                   Grid Engine User Commands                  QPING(1)

NAME

       qping - check application status of Grid Engine daemons.

SYNTAX

       qping [-help] [-noalias] [-ssl|-tcp] [ [ [-i <interval>] [-info] [-f] ]
       | [ [-dump_tag tag [param] ] [-dump] [-nonewline] ] ] <host> <port>
       <name> <id>

DESCRIPTION

       qping validates the runtime status of a Grid Engine service daemon.
       The current Grid Engine implementation allows one to query the
       GE_QMASTER daemon and any running GE_EXECD daemon. qping sends a SIM
       (Status Information Message) to the destination daemon. The
       communication layer of that daemon responds with a SIRM (Status
       Information Response Message), which contains status information
       about the queried daemon.

       The -dump and -dump_tag options allow an administrator to observe the
       communication protocol data flow of a Grid Engine service daemon.
       qping -dump must be started as root on the same host where the
       observed daemon is running.

OPTIONS

   -f
       Show full status information on each ping interval. The output
       contains the following fields:

       First output line: The date and time of the request.

       SIRM version: Internal version number of the SIRM (Status Information
       Response Message).

       SIRM message id: Current message id for this connection.

       start time: Start time of the daemon, in the following format:

       MM/DD/YYYY HH:MM:SS (seconds since 01.01.1970)

       run time [s]: Run time in seconds since the start time.

       messages in read buffer: Number of messages buffered in the
       communication read buffer. These messages are buffered for the
       application (daemon). If this number grows too large, the daemon is
       not able to handle all messages sent to it.

       messages in write buffer: Number of messages buffered in the
       communication write buffer. These messages were sent by the
       application (daemon) to the connected clients, but the communication
       layer has not yet been able to send them. If this number grows too
       large, the communication layer cannot send messages as fast as the
       application (daemon) wants them sent.

       nr. of connected clients: Number of clients currently connected to
       this daemon, including the current qping connection.

       status: The status value of the daemon. This value depends on the
       application that replies to the qping request. If the application
       does not provide any information, the status is 99999. The possible
       status values for the Grid Engine daemons are:

          qmaster:

             0 There is no unusual timing situation.

             1 One or more threads have reached the warning timeout. This
             may happen when at least one thread has not incremented its
             time stamp for an unusually long time. A possible reason is a
             high workload for this thread.

             2 One or more threads have reached the error timeout. This may
             happen when at least one thread has not incremented its time
             stamp for longer than 10 minutes.

             3 The time measurement is not initialized.

          execd:

             0 There is no unusual timing situation.

             1 The dispatcher has reached the warning timeout. This may
             happen when the dispatcher has not incremented its time stamp
             for an unusually long time. A possible reason is a high
             workload.

             2 The dispatcher has reached the error timeout. This may happen
             when the dispatcher has not incremented its time stamp for
             longer than 10 minutes.

             3 The time measurement is not initialized.

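       The status values listed above can be turned into short labels in a
       monitoring script. The following is a minimal sketch: the function
       name sirm_status_label and the label texts are illustrative
       assumptions, and only the numbers come from this page.

```shell
# sirm_status_label CODE
# Print a short label for a numeric "status:" value reported by
# qping -f or qping -info. The function name and the label wording are
# illustrative; only the numeric codes are taken from this man page.
sirm_status_label() {
    case "$1" in
        0)     echo "ok" ;;
        1)     echo "warning timeout" ;;
        2)     echo "error timeout" ;;
        3)     echo "time measurement not initialized" ;;
        99999) echo "no status provided" ;;
        *)     echo "unknown" ;;
    esac
}
```

       For example, "sirm_status_label 2" prints "error timeout". The same
       codes apply to qmaster (threads) and execd (dispatcher).
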
       info: Status message of the daemon. This value depends on the
       application that replies to the qping request. If the application
       does not provide any information, the info message is
       "not available". The possible info messages for the Grid Engine
       daemons are:

          qmaster:

             The info message contains information about the qmaster
             threads, each followed by a thread state and time information.
             Each time one of the known threads passes through its main
             loop, its time information is updated. Since the qmaster has
             two message threads, every message thread updates the time.
             This means the timeout for the message thread (MT) can only
             occur when no message thread is active anymore:

                THREAD_NAME: THREAD_STATE (THREAD_TIME)

                THREAD_NAME:
                   MAIN: Main thread
                   signaler: Signal thread
                   event_master: Event master thread
                   timer: Timer thread
                   worker: Worker thread
                   listener: Listener thread
                   scheduler: Scheduler thread
                   jvm: Java thread

                   The thread names above are followed by a 3-digit number.

                THREAD_STATE:
                   R: Running
                   W: Warning
                   E: Error

                THREAD_TIME:
                   Time since the time stamp was last updated.

             The thread information is followed by an additional
             information string that describes the complete application
             status.

          execd:

             The info message contains information for the execd job
             dispatcher:

                dispatcher: STATE (TIME)

             STATE:
                R: Running
                W: Warning
                E: Error

             TIME:
                Time since the time stamp was last updated.

             The dispatcher information is followed by an additional
             information string that describes the application status.

       Monitor: If available, displays statistics on a thread. The data for
       each thread is displayed on one line. The format of this line may
       change at any time. Only the qmaster implements monitoring.


   -help
       Prints a list of all options.


   -i interval
       Set the qping interval time.

       The default interval time is one second. qping sends a SIM (Status
       Information Message) at each interval.


   -info
       Show full status information (see -f for more information) and exit.
       An exit value of 0 indicates no error. On error, qping exits with 1.

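       Because -info exits after a single request, its exit value can
       drive a simple health check. The following is a minimal sketch, not
       part of qping itself: the function name daemon_is_up and the
       QPING_BIN variable are hypothetical, introduced only so the check
       can be tried without a Grid Engine installation.

```shell
# QPING_BIN is a hypothetical override, not a qping feature. It defaults
# to the qping binary found in PATH.
QPING_BIN="${QPING_BIN:-qping}"

# daemon_is_up HOST PORT NAME [ID]
# Returns 0 when "qping -info" exits with 0 and non-zero otherwise.
# ID defaults to 1, the id this page documents for daemons.
daemon_is_up() {
    "$QPING_BIN" -info "$1" "$2" "$3" "${4:-1}" >/dev/null 2>&1
}
```

       A caller might write "daemon_is_up master_host 31116 qmaster ||
       echo 'qmaster not reachable'", using the host and port from the
       EXAMPLES section.
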
   -noalias
       Ignore the host_aliases file, which is located at
       <ge_root>/<cell>/common/host_aliases. If this option is used, it is
       not necessary to set any Grid Engine environment variable.


   -ssl
       Select an SSL (Secure Socket Layer) configuration. qping uses this
       configuration to connect to services running SSL. If the SGE settings
       file is not sourced, use the -noalias option to bypass the need for
       the SGE_ROOT environment variable. The following environment
       variables specify your certificates:
         SSL_CA_CERT_FILE - CA certificate file
         SSL_CERT_FILE    - certificates file
         SSL_KEY_FILE     - key file
         SSL_RAND_FILE    - rand file

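       Before calling qping -ssl it can help to confirm that the
       certificate variables listed above point to readable files. The
       following is a sketch under assumptions: the function name
       ssl_env_ready is hypothetical, and this page does not state whether
       all four variables are mandatory, so treating them all as required
       is a choice made here.

```shell
# ssl_env_ready
# Succeed only if each of the four SSL_* variables named on this page is
# set and refers to a readable file. The first problem found is reported
# on stderr. Requiring all four is an assumption of this sketch.
ssl_env_ready() {
    for var in SSL_CA_CERT_FILE SSL_CERT_FILE SSL_KEY_FILE SSL_RAND_FILE; do
        # Indirect lookup: copy the value of the variable named by $var.
        eval "path=\${$var:-}"
        if [ -z "$path" ]; then
            echo "$var is not set" >&2
            return 1
        fi
        if [ ! -r "$path" ]; then
            echo "$var points to an unreadable file: $path" >&2
            return 1
        fi
    done
    return 0
}
```

       A possible use is "ssl_env_ready && qping -ssl -noalias -info
       master_host 31116 qmaster 1", with host and port taken from the
       EXAMPLES section.
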
   -tcp
       Select TCP/IP as the protocol used to connect to other services.


   -nonewline
       Dump output will not contain a line break within a message, and
       binary messages are not unpacked.


   -dump
       Allows an administrator to observe the communication protocol data
       flow of a Grid Engine service daemon. qping -dump must be started as
       root on the same host where the observed daemon is running.

       The output is written to stdout. The environment variable
       "SGE_QPING_OUTPUT_FORMAT" can be set to hide columns, to set a
       default column width, or to set a hostname output format. Its value
       can be any combination of the following specifiers, separated by a
       space character:
              "h:X"   -> hide column X
              "s:X"   -> show column X
              "w:X:Y" -> set width of column X to Y
              "hn:X"  -> set hostname output parameter X.
                         X values are "long" or "short"

       Run qping -help to see which columns are available.

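       The specifiers above combine into a single space-separated value.
       The following sketch shows one way to assemble it. COL_A and COL_B
       are placeholders, not real column identifiers; the real ones must
       be taken from the output of qping -help.

```shell
# COL_A and COL_B are placeholders for column identifiers listed by
# "qping -help"; they are not real qping column names.
hide="h:COL_A"          # hide column COL_A
width="w:COL_B:20"      # set the width of column COL_B to 20
hostfmt="hn:short"      # print short host names

# Specifiers are joined with single spaces, as this page describes.
SGE_QPING_OUTPUT_FORMAT="$hide $width $hostfmt"
export SGE_QPING_OUTPUT_FORMAT
```

       With the variable exported, a later qping -dump run in the same
       shell would pick up these settings.
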
   -dump_tag tag [param]
       Has the same meaning as -dump, but can provide more information by
       specifying the debug level and the message types qping should print:
          -dump_tag ALL <debug level>
             Shows all possible debug messages (APP+MSG) for the debug
             levels ERROR, WARNING, INFO, DEBUG and DPRINTF. The contacted
             service must support this kind of debugging. This option is
             not currently implemented.
          -dump_tag APP <debug level>
             Shows only application debug messages for the debug levels
             ERROR, WARNING, INFO, DEBUG and DPRINTF. The contacted service
             must support this kind of debugging. This option is not
             currently implemented.
          -dump_tag MSG
             Has the same behavior as the -dump option.


   host
       Host where the daemon is running.


   port
       Port to which the daemon is bound (the ge_qmaster or ge_execd port
       number).


   name
       Name of the communication endpoint ("qmaster" or "execd"). A
       communication endpoint is a triplet of
       hostname/endpoint name/endpoint id (e.g. hostA/qmaster/1 or
       subhost/qstat/4).


   id
       Id of the communication endpoint ("1" for daemons).

EXAMPLES

       > qping master_host 31116 qmaster 1
       08/24/2004 16:41:15 endpoint master_host/qmaster/1 at port 31116 is up since 365761 seconds
       08/24/2004 16:41:16 endpoint master_host/qmaster/1 at port 31116 is up since 365762 seconds
       08/24/2004 16:41:17 endpoint master_host/qmaster/1 at port 31116 is up since 365763 seconds

       > qping -info master_host 31116 qmaster 1
       08/24/2004 16:42:47:
       SIRM version:             0.1
       SIRM message id:          1
       start time:               08/20/2004 11:05:14 (1092992714)
       run time [s]:             365853
       messages in read buffer:  0
       messages in write buffer: 0
       nr. of connected clients: 4
       status:                   0
       info:                     ok

       > qping -info execd_host 31117 execd 1
       08/24/2004 16:43:45:
       SIRM version:             0.1
       SIRM message id:          1
       start time:               08/20/2004 11:06:13 (1092992773)
       run time [s]:             365852
       messages in read buffer:  0
       messages in write buffer: 0
       nr. of connected clients: 2
       status:                   0
       info:                     ok

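       The "name: value" layout of the -info output shown above lends
       itself to simple field extraction. The following is a minimal
       sketch using awk. The function name sirm_field is hypothetical, and
       the sample text is copied from the qmaster example on this page
       rather than from a live daemon.

```shell
# sirm_field NAME
# Read "qping -info" output on stdin and print the value of the first
# line that starts with "NAME:", with surrounding blanks removed.
sirm_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # tolerate indented input
        }
        index(line, key ":") == 1 {
            value = substr(line, length(key) + 2)
            sub(/^[ \t]+/, "", value)
            sub(/[ \t]+$/, "", value)
            print value
            exit
        }'
}

# Sample copied from the qmaster example on this page.
sample='08/24/2004 16:42:47:
SIRM version:             0.1
SIRM message id:          1
start time:               08/20/2004 11:05:14 (1092992714)
run time [s]:             365853
messages in read buffer:  0
messages in write buffer: 0
nr. of connected clients: 4
status:                   0
info:                     ok'

printf '%s\n' "$sample" | sirm_field "status"    # prints 0
```

       Against a running daemon the sample would be replaced by a pipe
       such as "qping -info master_host 31116 qmaster 1 | sirm_field
       status". Matching on the literal field name avoids problems with
       the colons inside the start time value.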

ENVIRONMENTAL VARIABLES

       GE_ROOT        Specifies the location of the Grid Engine standard
                      configuration files.

       GE_CELL        If set, specifies the default Grid Engine cell.

SEE ALSO

       ge_intro(1), host_aliases(5), ge_qmaster(8), ge_execd(8).

COPYRIGHT

       See ge_intro(1) for a full statement of rights and permissions.



GE 6.2u5                 $Date: 2009/03/12 16:06:25 $                 QPING(1)