SGE_SHADOWD(8)      Grid Engine Administrative Commands      SGE_SHADOWD(8)

NAME
       sge_shadowd - Grid Engine shadow master daemon

SYNOPSIS
       sge_shadowd

DESCRIPTION
       sge_shadowd is a "lightweight" process which can be run on shadow
       master hosts in a Grid Engine cluster to detect failure of the
       current Grid Engine master daemon, sge_qmaster(8), and to start up a
       new sge_qmaster(8) on the host on which the sge_shadowd runs. If
       multiple shadow daemons are active in a cluster, they run a protocol
       which ensures that only one of them will start up a new master
       daemon.

       The hosts suitable for being used as shadow master hosts must have
       shared root read/write access to the directory
       $SGE_ROOT/$SGE_CELL/common as well as to the master daemon spool
       directory (by default $SGE_ROOT/$SGE_CELL/spool/qmaster). The names
       of the shadow master hosts must be listed in the file
       $SGE_ROOT/$SGE_CELL/common/shadow_masters.

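As an illustration of the requirement above, the following Python sketch
builds the shadow_masters path from the environment and checks whether a
host is listed in it. It is not part of Grid Engine. The helper names are
hypothetical, and the file layout (one host name per line) is an
assumption that this page does not spell out.

```python
import os
import socket
from pathlib import Path


def shadow_masters_path(env=os.environ):
    """Build $SGE_ROOT/$SGE_CELL/common/shadow_masters from the environment."""
    root = env["SGE_ROOT"]                    # required; KeyError if unset
    cell = env.get("SGE_CELL") or "default"   # fall back to the default cell
    return Path(root) / cell / "common" / "shadow_masters"


def listed_shadow_hosts(path):
    """Return the non-empty lines of the file.

    Assumption: the file holds one host name per line.
    """
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def is_shadow_host(path, hostname=None):
    """True if hostname (default: this machine's name) appears in the file."""
    hostname = hostname or socket.gethostname()
    return hostname in listed_shadow_hosts(path)
```

A check such as is_shadow_host(shadow_masters_path()) could be run on a
candidate host before sge_shadowd is started there. Note that the name
returned by socket.gethostname() may differ in form (short versus fully
qualified) from the entries in the file.
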
RESTRICTIONS
       sge_shadowd may only be started by root.

ENVIRONMENTAL VARIABLES
       SGE_ROOT       Specifies the location of the Grid Engine standard
                      configuration files.

       SGE_CELL       If set, specifies the default Grid Engine cell. To
                      address a Grid Engine cell sge_shadowd uses (in the
                      order of precedence):

                           The name of the cell specified in the
                           environment variable SGE_CELL, if it is set.

                           The name of the default cell, i.e. default.

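The precedence above can be sketched in a few lines of Python. This is
only an illustration, not the daemon's code: resolve_cell is a
hypothetical helper, and treating an empty SGE_CELL like an unset one is
an assumption.

```python
import os

DEFAULT_CELL = "default"


def resolve_cell(env=None):
    """Return the cell name: SGE_CELL if set, otherwise "default"."""
    env = os.environ if env is None else env
    cell = env.get("SGE_CELL")
    # Assumption: an empty value is treated the same as an unset variable.
    return cell if cell else DEFAULT_CELL
```

For example, resolve_cell({"SGE_CELL": "prod"}) yields the name taken from
the variable, while resolve_cell({}) falls back to the default cell.
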

       SGE_DEBUG_LEVEL
                      If set, specifies that debug information should be
                      written to stderr. The value also defines the level
                      of detail in which debug information is generated.

       SGE_QMASTER_PORT
                      If set, specifies the TCP port on which
                      sge_qmaster(8) is expected to listen for
                      communication requests. Most installations will use
                      a services map entry for the service "sge_qmaster"
                      instead to define that port.

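The two ways of defining the port can be illustrated with a short Python
sketch. qmaster_port is a hypothetical helper, and letting the environment
variable take priority over the services entry is an assumption; this page
only says that either source may define the port.

```python
import os
import socket


def qmaster_port(env=None):
    """Return the qmaster TCP port from SGE_QMASTER_PORT or the services map."""
    env = os.environ if env is None else env
    value = env.get("SGE_QMASTER_PORT")
    if value:
        # Assumption: the variable wins when both sources are present.
        return int(value)
    # Look up the "sge_qmaster" service (e.g. in /etc/services).
    # Raises OSError if no such entry exists.
    return socket.getservbyname("sge_qmaster", "tcp")
```

Calling qmaster_port() on a host without either source raises OSError,
which makes a missing port definition visible early.
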
       SGE_DELAY_TIME This variable controls how long sge_shadowd pauses
                      if a takeover bid fails. This value is used only
                      when there are multiple sge_shadowd instances and
                      they are contending to be the master. The default is
                      600 seconds.

       SGE_CHECK_INTERVAL
                      This variable controls the interval at which
                      sge_shadowd checks the heartbeat file (60 seconds by
                      default).

       SGE_GET_ACTIVE_INTERVAL
                      This variable controls the interval after which an
                      sge_shadowd instance tries to take over when the
                      heartbeat file has not changed.

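How the two intervals might interact can be sketched as a polling loop in
Python. This is a simplified model, not the logic of sge_shadowd itself:
HeartbeatWatcher and wait_for_stall are hypothetical names, comparing the
raw file contents is an assumption about how a heartbeat change is
detected, and the interval values passed in are up to the caller.

```python
import time
from pathlib import Path


class HeartbeatWatcher:
    """Remember the last heartbeat file contents and report changes."""

    def __init__(self, path):
        self.path = Path(path)
        self.last = None

    def changed(self):
        """Return True if the file content differs from the previous read."""
        try:
            current = self.path.read_bytes()
        except OSError:
            current = None  # an unreadable file counts as "no new heartbeat"
        is_new = current is not None and current != self.last
        if is_new:
            self.last = current
        return is_new


def wait_for_stall(watcher, check_interval, get_active_interval,
                   sleep=time.sleep):
    """Block until the heartbeat was unchanged for get_active_interval seconds.

    The file is polled every check_interval seconds; any change resets
    the stall counter. Returns the accumulated stall time in seconds.
    """
    stalled = 0
    while stalled < get_active_interval:
        sleep(check_interval)
        stalled = 0 if watcher.changed() else stalled + check_interval
    return stalled
```

In this model a takeover attempt would follow the return of
wait_for_stall. The sleep parameter exists so that the loop can be
exercised without real delays.
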
FILES
       $SGE_ROOT/$SGE_CELL/common
                      Default configuration directory.
       $SGE_ROOT/$SGE_CELL/common/shadow_masters
                      Shadow master hostname file.
       $SGE_ROOT/$SGE_CELL/spool/qmaster
                      Default master daemon spool directory.
       $SGE_ROOT/$SGE_CELL/spool/qmaster/heartbeat
                      The heartbeat file.

SEE ALSO
       sge_intro(1), sge_conf(5), sge_qmaster(8), Grid Engine Installation
       and Administration Guide.

COPYRIGHT
       See sge_intro(1) for a full statement of rights and permissions.

GE 6.1               $Date: 2007/11/06 18:18:13 $               SGE_SHADOWD(8)