CONDOR_DRAIN(1)                 HTCondor Manual                CONDOR_DRAIN(1)

NAME
       condor_drain - HTCondor Manual

       Control draining of an execute machine

SYNOPSIS
       condor_drain [-help]

       condor_drain [-debug] [-pool pool-name] [-graceful | -quick | -fast]
       [-reason reason-text] [-resume-on-completion | -restart-on-completion |
       -exit-on-completion] [-check expr] [-start expr] machine-name

       condor_drain [-debug] [-pool pool-name] -cancel [-request-id id]
       machine-name

DESCRIPTION
       condor_drain is an administrative command used to control the draining
       of all slots on an execute machine. When a machine is draining, it will
       not accept any new jobs unless the -start expression specifies
       otherwise. The machine to drain is specified by the argument
       machine-name, which is the same as the value of the machine ClassAd
       attribute Machine.

       How currently running jobs are treated depends on the draining schedule
       that is chosen with a command-line option:

       -graceful
              Initiate a graceful eviction of the job. This means all
              promises that have been made to the job are honored, including
              MaxJobRetirementTime. The eviction of jobs is coordinated to
              reduce idle time. This means that if one slot has a job with
              a long retirement time and the other slots have jobs with
              shorter retirement times, the effective retirement time for
              all of the jobs is the longer one. If no draining schedule is
              specified, -graceful is chosen by default.

       -quick MaxJobRetirementTime is not honored. Eviction of jobs is
              immediately initiated. Jobs are given time to shut down
              according to the usual policy, that is, given by
              MachineMaxVacateTime.

       -fast  Jobs are immediately hard-killed, with no chance to
              gracefully shut down.

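       As an illustration of choosing among the three schedules, the
       following shell function is a minimal sketch and not part of HTCondor.
       The function name drain_with_schedule is hypothetical; only the
       -graceful, -quick, and -fast options are taken from this page.

```shell
# Sketch only: drain_with_schedule is a hypothetical helper, not an HTCondor tool.
# Usage: drain_with_schedule SCHEDULE MACHINE
#   SCHEDULE is one of: graceful, quick, fast
drain_with_schedule() {
    schedule="$1"
    machine="$2"
    case "$schedule" in
        graceful|quick|fast)
            # Pass exactly one schedule option, since they are mutually exclusive.
            condor_drain "-$schedule" "$machine"
            ;;
        *)
            echo "unknown draining schedule: $schedule" >&2
            return 2
            ;;
    esac
}
```

       For example, drain_with_schedule quick exec01.example.org (a made-up
       host name) would run condor_drain -quick exec01.example.org.
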
       If you specify -graceful, you may also specify -start. On a
       gracefully-draining machine, some jobs may finish retiring before
       others. By default, the resources used by the newly-retired jobs do not
       become available for use by other jobs until the machine exits the
       draining state (see below). The -start expression you supply replaces
       the draining machine's normal START expression for the duration of the
       draining state, potentially making those resources available. See the
       condor_startd Policy Configuration section of the administrators'
       manual for more information.

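       One way the -start option might be used is sketched below. The helper
       name drain_allow_owner is hypothetical, and the expression is only an
       assumed example: it admits jobs whose Owner attribute equals a given
       user name while the machine drains. Whether such a policy is suitable
       depends on the pool.

```shell
# Sketch only: drain_allow_owner is a hypothetical helper, not an HTCondor tool.
# Usage: drain_allow_owner OWNER MACHINE
drain_allow_owner() {
    owner="$1"
    machine="$2"
    # The whole expression is passed as a single -start argument; the inner
    # double quotes delimit a ClassAd string literal. The expression itself
    # is an assumed example policy.
    condor_drain -graceful -start "TARGET.Owner == \"$owner\"" "$machine"
}
```

       Quoting matters here: the shell must deliver the expression to
       condor_drain as one argument, with the ClassAd string quotes intact.
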
       Once draining is complete, the machine will enter the Drained/Idle
       state. To resume normal operation (negotiation) at that time or at any
       earlier time during draining, the -cancel option may be used. The
       -resume-on-completion option, which may be used when initiating
       draining, results in automatic resumption of normal operation once
       draining has completed. This is useful for forcing a machine with
       partitionable slots to join all of the resources back together into one
       machine, facilitating de-fragmentation and whole machine negotiation.

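       The two ways of returning a machine to service described above can be
       sketched as a pair of shell functions. The names defrag_machine and
       cancel_drain are hypothetical; the options are the ones documented on
       this page.

```shell
# Sketch only: both helpers are hypothetical wrappers around condor_drain.

# Usage: defrag_machine MACHINE
# Drain gracefully and let the machine resume normal operation by itself
# once draining has completed.
defrag_machine() {
    condor_drain -graceful -resume-on-completion "$1"
}

# Usage: cancel_drain MACHINE
# Return a draining or already drained machine to normal operation.
cancel_drain() {
    condor_drain -cancel "$1"
}
```

       The first form suits unattended defragmentation; the second is for
       an administrator who decides when the machine goes back into service.
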
OPTIONS
       -help  Display brief usage information and exit.

       -debug Causes debugging information to be sent to stderr, based on
              the value of the configuration variable TOOL_DEBUG.

       -pool pool-name
              Specify an alternate HTCondor pool, if the default one is not
              desired.

       -graceful
              (the default) Honor the maximum vacate and retirement time
              policy.

       -quick Honor the maximum vacate time, but not the retirement time
              policy.

       -fast  Honor neither the maximum vacate time policy nor the
              retirement time policy.

       -reason reason-text
              Set the drain reason to reason-text. While the condor_startd
              is draining it will advertise the given reason. If this
              option is not used the reason defaults to the name of the user
              that started the drain.

       -resume-on-completion
              When done draining, resume normal operation, such that
              potentially the whole machine could be claimed.

       -restart-on-completion
              When done draining, restart the condor_startd daemon so that
              configuration changes will take effect.

       -exit-on-completion
              When done draining, shut down the condor_startd daemon and
              tell the condor_master not to restart it automatically.

       -check expr
              Abort draining if expr is not true for all slots to be
              drained.

       -start expr
              The START expression to use while the machine is draining.
              The expression cannot reference the machine's existing START
              expression.

       -cancel
              Cancel a prior draining request, to permit the
              condor_negotiator to use the machine again.

       -request-id id
              Specify a specific draining request to cancel, where id is
              given by the DrainingRequestId machine ClassAd attribute.

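       The following sketch shows one possible way to cancel a specific
       request by looking up DrainingRequestId first. It rests on several
       assumptions: that condor_status with -constraint and -autoformat is
       available, that it prints one line per matching slot ad, and that it
       prints the word undefined where the attribute is not set. The
       function name cancel_current_drain is hypothetical.

```shell
# Sketch only: cancel_current_drain is a hypothetical helper.
# Usage: cancel_current_drain MACHINE
cancel_current_drain() {
    machine="$1"
    # Assumption: one output line per slot ad; slots without a draining
    # request print "undefined". Keep the first real value.
    id=$(condor_status -constraint "Machine == \"$machine\"" \
             -autoformat DrainingRequestId | grep -v '^undefined$' | head -n 1)
    if [ -z "$id" ]; then
        echo "no draining request found for $machine" >&2
        return 1
    fi
    condor_drain -cancel -request-id "$id" "$machine"
}
```

       Naming the request explicitly guards against cancelling a newer
       request than the one the administrator inspected.
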
EXIT STATUS
       condor_drain will exit with a non-zero status value if it fails and
       zero status if it succeeds.

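       A script can act on this exit status as in the following minimal
       sketch. The function name drain_or_report and its messages are
       hypothetical.

```shell
# Sketch only: drain_or_report is a hypothetical helper.
# Usage: drain_or_report MACHINE
drain_or_report() {
    machine="$1"
    if condor_drain -graceful "$machine"; then
        echo "drain requested for $machine"
    else
        # In the else branch, $? still holds condor_drain's exit status.
        status=$?
        echo "condor_drain failed for $machine (exit status $status)" >&2
        return "$status"
    fi
}
```

       The function passes the non-zero status back to its caller, so it
       can be chained with other commands using && or ||.
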
AUTHOR
       HTCondor Team

COPYRIGHT
       1990-2023, Center for High Throughput Computing, Computer Sciences
       Department, University of Wisconsin-Madison, Madison, WI, US. Licensed
       under the Apache License, Version 2.0.

                                 Oct 02, 2023                  CONDOR_DRAIN(1)