CONDOR_DRAIN(1)                 HTCondor Manual                CONDOR_DRAIN(1)

NAME
       condor_drain - HTCondor Manual

       Control draining of an execute machine

SYNOPSIS
       condor_drain [-help]

       condor_drain [-debug] [-pool pool-name] [-graceful | -quick | -fast]
       [-resume-on-completion] [-check expr] [-start expr] machine-name

       condor_drain [-debug] [-pool pool-name] -cancel [-request-id id]
       machine-name

DESCRIPTION
       condor_drain is an administrative command used to control the draining
       of all slots on an execute machine. While a machine is draining, it
       does not accept any new jobs unless the -start expression specifies
       otherwise. The machine to drain is specified by the argument
       machine-name, which is the value of the machine ClassAd attribute
       Machine.

       How currently running jobs are treated depends on the draining schedule
       chosen with a command-line option:

       -graceful
              Initiate a graceful eviction of each job. All promises that
              have been made to the job are honored, including
              MaxJobRetirementTime. The eviction of jobs is coordinated to
              reduce idle time: if one slot has a job with a long retirement
              time and the other slots have jobs with shorter retirement
              times, the effective retirement time for all of the jobs is
              the longer one. If no draining schedule is specified,
              -graceful is chosen by default.

       -quick MaxJobRetirementTime is not honored. Eviction of jobs is
              initiated immediately. Jobs are given time to shut down and
              produce checkpoints according to the usual policy, as given by
              MachineMaxVacateTime.

       -fast  Jobs are immediately hard-killed, with no chance to shut down
              gracefully or produce a checkpoint.
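The three schedules and the coordinated retirement time under -graceful can be summarized in a short sketch. It is only a model of the description above, not HTCondor code: the names SCHEDULES and effective_retirement_time are hypothetical, and the per-slot retirement times are made-up values in seconds.

```python
# Which policies each draining schedule honors, as described in the text.
SCHEDULES = {
    "-graceful": {"retirement_time": True, "vacate_time": True},
    "-quick": {"retirement_time": False, "vacate_time": True},
    "-fast": {"retirement_time": False, "vacate_time": False},
}


def effective_retirement_time(per_slot_seconds):
    """Under -graceful, all jobs effectively share the longest retirement time."""
    return max(per_slot_seconds, default=0)


# Hypothetical retirement times (seconds) for jobs on three slots.
slots = [600, 7200, 1800]
print(effective_retirement_time(slots))  # 7200
```

In this model, the slots holding the 600-second and 1800-second jobs are not treated as finished early; the machine as a whole is expected to drain on the 7200-second timescale.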

       If you specify -graceful, you may also specify -start. On a gracefully
       draining machine, some jobs may finish retiring before others. By
       default, the resources used by the newly retired jobs do not become
       available to other jobs until the machine exits the draining state
       (see below). The -start expression you supply replaces the draining
       machine's normal START expression for the duration of the draining
       state, potentially making those resources available. See the
       condor_startd policy configuration section of the administrators'
       manual for more information.
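As an illustration of combining -graceful with -start, the following sketch builds the argument list for such an invocation without running it. The helper name graceful_drain_argv, the machine name, and the expression are all assumptions made for the example; which START expression is appropriate depends entirely on local policy.

```python
import shlex


def graceful_drain_argv(machine, start_expr=None):
    """Build a condor_drain argument list for a graceful drain (nothing is executed)."""
    argv = ["condor_drain", "-graceful"]
    if start_expr is not None:
        # The expression travels as one argument, so it needs no shell quoting here.
        argv += ["-start", start_expr]
    argv.append(machine)
    return argv


# Hypothetical machine name and START expression, for illustration only.
argv = graceful_drain_argv("exec01.example.org", 'TARGET.Owner == "backfill"')
print(shlex.join(argv))  # shell-quoted form, suitable for pasting into a terminal
```

Building a list rather than a single string keeps the quotes inside the expression intact when the list is later handed to a process-launching API.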

       Once draining is complete, the machine enters the Drained/Idle state.
       To resume normal operation (negotiation) at that time, or at any
       earlier time during draining, use the -cancel option. The
       -resume-on-completion option, which may be given when initiating
       draining, causes normal operation to resume automatically once
       draining has completed. This is useful for forcing a machine with
       partitionable slots to join all of its resources back together into
       one machine, facilitating defragmentation and whole-machine
       negotiation.
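The two ways of returning to normal operation described above can be sketched as argument lists. This is a minimal illustration using only options from the synopsis; the helper names, the machine name, and the request id value are hypothetical, and nothing is executed.

```python
def defrag_drain_argv(machine):
    """Drain gracefully and resume normal operation automatically when done."""
    return ["condor_drain", "-graceful", "-resume-on-completion", machine]


def cancel_drain_argv(machine, request_id=None, pool=None):
    """Cancel a draining request, optionally naming the pool and the request."""
    argv = ["condor_drain"]
    if pool is not None:
        argv += ["-pool", pool]
    argv.append("-cancel")
    if request_id is not None:
        # id is the value of the DrainingRequestId machine ClassAd attribute.
        argv += ["-request-id", str(request_id)]
    argv.append(machine)
    return argv


# Hypothetical machine name and request id.
print(defrag_drain_argv("exec01.example.org"))
print(cancel_drain_argv("exec01.example.org", request_id="example-request-id"))
```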

OPTIONS
       -help  Display brief usage information and exit.

       -debug Send debugging information to stderr, based on the value of
              the configuration variable TOOL_DEBUG.

       -pool pool-name
              Specify an alternate HTCondor pool, if the default one is not
              desired.

       -graceful
              (the default) Honor the maximum vacate time and retirement time
              policy.

       -quick Honor the maximum vacate time, but not the retirement time
              policy.

       -fast  Honor neither the maximum vacate time policy nor the
              retirement time policy.

       -resume-on-completion
              When done draining, resume normal operation, so that
              potentially the whole machine could be claimed.

       -check expr
              Abort draining if expr is not true for all slots to be drained.

       -start expr
              The START expression to use while the machine is draining. The
              expression cannot reference the machine's existing START
              expression.

       -cancel
              Cancel a prior draining request, to permit the
              condor_negotiator to use the machine again.

       -request-id id
              Identify a specific draining request to cancel, where id is
              given by the DrainingRequestId machine ClassAd attribute.
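The rule stated for -check, that draining is aborted unless expr is true for every slot to be drained, can be modeled as below. This is a sketch of the rule as worded, not of the tool's internals: the function name is hypothetical, and representing an expression that evaluates to something other than true (for example, undefined) as None is an assumption.

```python
def check_allows_draining(per_slot_results):
    """Return True only if the check expression was true for every slot."""
    return all(result is True for result in per_slot_results)


# Hypothetical evaluation results of a -check expression on three slots.
print(check_allows_draining([True, True, True]))   # True: draining proceeds
print(check_allows_draining([True, False, True]))  # False: draining is aborted
print(check_allows_draining([True, None, True]))   # False: "not true" includes None
```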

EXIT STATUS
       condor_drain exits with a status value of zero if it succeeds and a
       non-zero status value if it fails.
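A script that wraps condor_drain can rely on this zero or non-zero convention. The sketch below shows one way to do so; the helper name tool_succeeded is hypothetical, and the Python interpreter is used as a stand-in command so that the example runs on a machine without HTCondor installed.

```python
import subprocess
import sys


def tool_succeeded(argv):
    """Run a command and report success using the zero / non-zero convention."""
    completed = subprocess.run(argv, check=False)
    return completed.returncode == 0


# Stand-in commands; in practice argv would start with "condor_drain"
# followed by the desired options and the machine name.
print(tool_succeeded([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(tool_succeeded([sys.executable, "-c", "raise SystemExit(1)"]))  # False
```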

AUTHOR
       HTCondor Team

COPYRIGHT
       1990-2021, Center for High Throughput Computing, Computer Sciences
       Department, University of Wisconsin-Madison, Madison, WI, US. Licensed
       under the Apache License, Version 2.0.

8.8                              Aug 23, 2021                  CONDOR_DRAIN(1)