1GMX-WHAM(1) GROMACS GMX-WHAM(1)
2
3
4
6 gmx-wham - Perform weighted histogram analysis after umbrella sampling
7
9 gmx wham [-ix [<.dat>]] [-if [<.dat>]] [-it [<.dat>]] [-is [<.dat>]]
10 [-iiact [<.dat>]] [-tab [<.dat>]] [-o [<.xvg>]]
11 [-hist [<.xvg>]] [-oiact [<.xvg>]] [-bsres [<.xvg>]]
12 [-bsprof [<.xvg>]] [-xvg <enum>] [-min <real>] [-max <real>]
13 [-[no]auto] [-bins <int>] [-temp <real>] [-tol <real>]
14 [-[no]v] [-b <real>] [-e <real>] [-dt <real>]
15 [-[no]histonly] [-[no]boundsonly] [-[no]log] [-unit <enum>]
16 [-zprof0 <real>] [-[no]cycl] [-[no]sym] [-[no]ac]
17 [-acsig <real>] [-ac-trestart <real>] [-nBootstrap <int>]
18 [-bs-method <enum>] [-bs-tau <real>] [-bs-seed <int>]
19 [-histbs-block <int>] [-[no]vbs]
20
22 gmx wham is an analysis program that implements the Weighted Histogram
23 Analysis Method (WHAM). It is intended to analyze output files gener‐
24 ated by umbrella sampling simulations to compute a potential of mean
25 force (PMF).
26
27 gmx wham is currently not fully up to date. It only supports pull set‐
28 ups where the first pull coordinate(s) is/are umbrella pull coordinates
29 and, if multiple coordinates need to be analyzed, all used the same ge‐
30 ometry and dimensions. In most cases this is not an issue.
31
32 At present, three input modes are supported.
33
34 • With option -it, the user provides a file which contains the file
35 names of the umbrella simulation run-input files (.tpr files), AND,
36 with option -ix, a file which contains file names of the pullx mdrun
37 output files. The .tpr and pullx files must be in corresponding or‐
38 der, i.e. the first .tpr created the first pullx, etc.
39
40 • Same as the previous input mode, except that the user provides the
41 pull force output file names (pullf.xvg) with option -if. From the
42 pull force the position in the umbrella potential is computed. This
43 does not work with tabulated umbrella potentials.
44
45 By default, all pull coordinates found in all pullx/pullf files are
46 used in WHAM. If only some of the pull coordinates should be used, a
47 pull coordinate selection file (option -is) can be provided. The selec‐
48 tion file must contain one line for each tpr file in tpr-files.dat.
49 Each of these lines must contain one digit (0 or 1) for each pull coor‐
50 dinate in the tpr file. Here, 1 indicates that the pull coordinate is
51 used in WHAM, and 0 means it is omitted. Example: If you have three
52 tpr files, each containing 4 pull coordinates, but only pull coordi‐
53 nates 1 and 2 should be used, coordsel.dat looks like this:
54
55 1 1 0 0
56 1 1 0 0
57 1 1 0 0
58
59 By default, the output files are:
60
61 ``-o`` PMF output file
62 ``-hist`` Histograms output file
63
64 Always check whether the histograms sufficiently overlap.
65
66 The umbrella potential is assumed to be harmonic and the force con‐
67 stants are read from the .tpr files. If a non-harmonic umbrella force
68 was applied a tabulated potential can be provided with -tab.
69
70 WHAM options
71 • -bins Number of bins used in analysis
72
73 • -temp Temperature in the simulations
74
75 • -tol Stop iteration if profile (probability) changed less than
76 tolerance
77
78 • -auto Automatic determination of boundaries
79
80 • -min,-max Boundaries of the profile
81
82 The data points that are used to compute the profile can be restricted
83 with options -b, -e, and -dt. Adjust -b to ensure sufficient equili‐
84 bration in each umbrella window.
85
86 With -log (default) the profile is written in energy units, otherwise
87 (with -nolog) as probability. The unit can be specified with -unit.
88 With energy output, the energy in the first bin is defined to be zero.
89 If you want the free energy at a different position to be zero, set
90 -zprof0 (useful with bootstrapping, see below).
91
92 For cyclic or periodic reaction coordinates (dihedral angle, channel
93 PMF without osmotic gradient), the option -cycl is useful. gmx wham
94 will make use of the periodicity of the system and generate a periodic
95 PMF. The first and the last bin of the reaction coordinate will assumed
96 be be neighbors.
97
98 Option -sym symmetrizes the profile around z=0 before output, which may
99 be useful for, e.g. membranes.
100
101 Parallelization
102 If available, the number of OpenMP threads used by gmx wham can be con‐
103 trolled by setting the OMP_NUM_THREADS environment variable.
104
105 Autocorrelations
106 With -ac, gmx wham estimates the integrated autocorrelation time (IACT)
107 tau for each umbrella window and weights the respective window with
108 1/[1+2*tau/dt]. The IACTs are written to the file defined with -oiact.
109 In verbose mode, all autocorrelation functions (ACFs) are written to
110 hist_autocorr.xvg. Because the IACTs can be severely underestimated in
111 case of limited sampling, option -acsig allows one to smooth the IACTs
112 along the reaction coordinate with a Gaussian (sigma provided with -ac‐
113 sig, see output in iact.xvg). Note that the IACTs are estimated by sim‐
114 ple integration of the ACFs while the ACFs are larger 0.05. If you
115 prefer to compute the IACTs by a more sophisticated (but possibly less
116 robust) method such as fitting to a double exponential, you can compute
117 the IACTs with gmx analyze and provide them to gmx wham with the file
118 iact-in.dat (option -iiact), which should contain one line per input
119 file (pullx/pullf file) and one column per pull coordinate in the re‐
120 spective file.
121
122 Error analysis
123 Statistical errors may be estimated with bootstrap analysis. Use it
124 with care, otherwise the statistical error may be substantially under‐
125 estimated. More background and examples for the bootstrap technique
126 can be found in Hub, de Groot and Van der Spoel, JCTC (2010) 6:
127 3713-3720. -nBootstrap defines the number of bootstraps (use, e.g.,
128 100). Four bootstrapping methods are supported and selected with
129 -bs-method.
130
131 • b-hist Default: complete histograms are considered as independent
132 data points, and the bootstrap is carried out by assigning random
133 weights to the histograms ("Bayesian bootstrap"). Note that each
134 point along the reaction coordinate must be covered by multiple inde‐
135 pendent histograms (e.g. 10 histograms), otherwise the statistical
136 error is underestimated.
137
138 • hist Complete histograms are considered as independent data
139 points. For each bootstrap, N histograms are randomly chosen from
140 the N given histograms (allowing duplication, i.e. sampling with re‐
141 placement). To avoid gaps without data along the reaction coordinate
142 blocks of histograms (-histbs-block) may be defined. In that case,
143 the given histograms are divided into blocks and only histograms
144 within each block are mixed. Note that the histograms within each
145 block must be representative for all possible histograms, otherwise
146 the statistical error is underestimated.
147
148 • traj The given histograms are used to generate new random trajecto‐
149 ries, such that the generated data points are distributed according
150 the given histograms and properly autocorrelated. The autocorrelation
151 time (ACT) for each window must be known, so use -ac or provide the
152 ACT with -iiact. If the ACT of all windows are identical (and known),
153 you can also provide them with -bs-tau. Note that this method may
154 severely underestimate the error in case of limited sampling, that is
155 if individual histograms do not represent the complete phase space at
156 the respective positions.
157
158 • traj-gauss The same as method traj, but the trajectories are not
159 bootstrapped from the umbrella histograms but from Gaussians with the
160 average and width of the umbrella histograms. That method yields sim‐
161 ilar error estimates like method traj.
162
163 Bootstrapping output:
164
165 • -bsres Average profile and standard deviations
166
167 • -bsprof All bootstrapping profiles
168
169 With -vbs (verbose bootstrapping), the histograms of each bootstrap are
170 written, and, with bootstrap method traj, the cumulative distribution
171 functions of the histograms.
172
174 Options to specify input files:
175
176 -ix [<.dat>] (pullx-files.dat) (Optional)
177 Generic data file
178
179 -if [<.dat>] (pullf-files.dat) (Optional)
180 Generic data file
181
182 -it [<.dat>] (tpr-files.dat) (Optional)
183 Generic data file
184
185 -is [<.dat>] (coordsel.dat) (Optional)
186 Generic data file
187
188 -iiact [<.dat>] (iact-in.dat) (Optional)
189 Generic data file
190
191 -tab [<.dat>] (umb-pot.dat) (Optional)
192 Generic data file
193
194 Options to specify output files:
195
196 -o [<.xvg>] (profile.xvg)
197 xvgr/xmgr file
198
199 -hist [<.xvg>] (histo.xvg)
200 xvgr/xmgr file
201
202 -oiact [<.xvg>] (iact.xvg) (Optional)
203 xvgr/xmgr file
204
205 -bsres [<.xvg>] (bsResult.xvg) (Optional)
206 xvgr/xmgr file
207
208 -bsprof [<.xvg>] (bsProfs.xvg) (Optional)
209 xvgr/xmgr file
210
211 Other options:
212
213 -xvg <enum> (xmgrace)
214 xvg plot formatting: xmgrace, xmgr, none
215
216 -min <real> (0)
217 Minimum coordinate in profile
218
219 -max <real> (0)
220 Maximum coordinate in profile
221
222 -[no]auto (yes)
223 Determine min and max automatically
224
225 -bins <int> (200)
226 Number of bins in profile
227
228 -temp <real> (298)
229 Temperature
230
231 -tol <real> (1e-06)
232 Tolerance
233
234 -[no]v (no)
235 Verbose mode
236
237 -b <real> (50)
238 First time to analyse (ps)
239
240 -e <real> (1e+20)
241 Last time to analyse (ps)
242
243 -dt <real> (0)
244 Analyse only every dt ps
245
246 -[no]histonly (no)
247 Write histograms and exit
248
249 -[no]boundsonly (no)
250 Determine min and max and exit (with -auto)
251
252 -[no]log (yes)
253 Calculate the log of the profile before printing
254
255 -unit <enum> (kJ)
256 Energy unit in case of log output: kJ, kCal, kT
257
258 -zprof0 <real> (0)
259 Define profile to 0.0 at this position (with -log)
260
261 -[no]cycl (no)
262 Create cyclic/periodic profile. Assumes min and max are the same
263 point.
264
265 -[no]sym (no)
266 Symmetrize profile around z=0
267
268 -[no]ac (no)
269 Calculate integrated autocorrelation times and use in wham
270
271 -acsig <real> (0)
272 Smooth autocorrelation times along reaction coordinate with
273 Gaussian of this sigma
274
275 -ac-trestart <real> (1)
276 When computing autocorrelation functions, restart computing ev‐
277 ery .. (ps)
278
279 -nBootstrap <int> (0)
280 nr of bootstraps to estimate statistical uncertainty (e.g., 200)
281
282 -bs-method <enum> (b-hist)
283 Bootstrap method: b-hist, hist, traj, traj-gauss
284
285 -bs-tau <real> (0)
286 Autocorrelation time (ACT) assumed for all histograms. Use op‐
287 tion -ac if ACT is unknown.
288
289 -bs-seed <int> (-1)
290 Seed for bootstrapping. (-1 = use time)
291
292 -histbs-block <int> (8)
293 When mixing histograms only mix within blocks of -histbs-block.
294
295 -[no]vbs (no)
296 Verbose bootstrapping. Print the CDFs and a histogram file for
297 each bootstrap.
298
300 gmx(1)
301
302 More information about GROMACS is available at <‐
303 http://www.gromacs.org/>.
304
306 2022, GROMACS development team
307
308
309
310
3112022.2 Jun 16, 2022 GMX-WHAM(1)