1GMX-WHAM(1) GROMACS GMX-WHAM(1)
2
3
4
6 gmx-wham - Perform weighted histogram analysis after umbrella sampling
7
9 gmx wham [-ix [<.dat>]] [-if [<.dat>]] [-it [<.dat>]] [-ip [<.dat>]]
10 [-is [<.dat>]] [-iiact [<.dat>]] [-tab [<.dat>]]
11 [-o [<.xvg>]] [-hist [<.xvg>]] [-oiact [<.xvg>]]
12 [-bsres [<.xvg>]] [-bsprof [<.xvg>]] [-xvg <enum>]
13 [-min <real>] [-max <real>] [-[no]auto] [-bins <int>]
14 [-temp <real>] [-tol <real>] [-[no]v] [-b <real>]
15 [-e <real>] [-dt <real>] [-[no]histonly] [-[no]boundsonly]
16 [-[no]log] [-unit <enum>] [-zprof0 <real>] [-[no]cycl]
17 [-[no]sym] [-[no]ac] [-acsig <real>] [-ac-trestart <real>]
18 [-nBootstrap <int>] [-bs-method <enum>] [-bs-tau <real>]
19 [-bs-seed <int>] [-histbs-block <int>] [-[no]vbs]
20
22 gmx wham is an analysis program that implements the Weighted Histogram
23 Analysis Method (WHAM). It is intended to analyze output files gener‐
24 ated by umbrella sampling simulations to compute a potential of mean
25 force (PMF).
26
27 gmx wham is currently not fully up to date. It only supports pull set‐
28 ups where the first pull coordinate(s) is/are umbrella pull coordinates
29 and, if multiple coordinates need to be analyzed, all used the same
30 geometry and dimensions. In most cases this is not an issue.
31
32 At present, three input modes are supported.
33
34 · With option -it, the user provides a file which contains the file
35 names of the umbrella simulation run-input files (.tpr files), AND,
36 with option -ix, a file which contains file names of the pullx mdrun
37 output files. The .tpr and pullx files must be in corresponding
38 order, i.e. the first .tpr created the first pullx, etc.
39
40 · Same as the previous input mode, except that the the user provides
41 the pull force output file names (pullf.xvg) with option -if. From
42 the pull force the position in the umbrella potential is computed.
43 This does not work with tabulated umbrella potentials.
44
45 · With option -ip, the user provides file names of (gzipped) .pdo
46 files, i.e. the GROMACS 3.3 umbrella output files. If you have some
47 unusual reaction coordinate you may also generate your own .pdo files
48 and feed them with the -ip option into to gmx wham. The .pdo file
49 header must be similar to the following:
50
51 # UMBRELLA 3.0
52 # Component selection: 0 0 1
53 # nSkip 1
54 # Ref. Group 'TestAtom'
55 # Nr. of pull groups 2
56 # Group 1 'GR1' Umb. Pos. 5.0 Umb. Cons. 1000.0
57 # Group 2 'GR2' Umb. Pos. 2.0 Umb. Cons. 500.0
58 #####
59
60 The number of pull groups, umbrella positions, force constants, and
61 names may (of course) differ. Following the header, a time column and
62 a data column for each pull group follows (i.e. the displacement with
63 respect to the umbrella center). Up to four pull groups are possible
64 per .pdo file at present.
65
66 By default, all pull coordinates found in all pullx/pullf files are
67 used in WHAM. If only some of the pull coordinates should be used, a
68 pull coordinate selection file (option -is) can be provided. The selec‐
69 tion file must contain one line for each tpr file in tpr-files.dat.
70 Each of these lines must contain one digit (0 or 1) for each pull coor‐
71 dinate in the tpr file. Here, 1 indicates that the pull coordinate is
72 used in WHAM, and 0 means it is omitted. Example: If you have three tpr
73 files, each containing 4 pull coordinates, but only pull coordinates 1
74 and 2 should be used, coordsel.dat looks like this:
75
76 1 1 0 0
77 1 1 0 0
78 1 1 0 0
79
80 By default, the output files are:
81
82 ``-o`` PMF output file
83 ``-hist`` Histograms output file
84
85 Always check whether the histograms sufficiently overlap.
86
87 The umbrella potential is assumed to be harmonic and the force con‐
88 stants are read from the .tpr or .pdo files. If a non-harmonic umbrella
89 force was applied a tabulated potential can be provided with -tab.
90
91 WHAM options
92 · -bins Number of bins used in analysis
93
94 · -temp Temperature in the simulations
95
96 · -tol Stop iteration if profile (probability) changed less than
97 tolerance
98
99 · -auto Automatic determination of boundaries
100
101 · -min,-max Boundaries of the profile
102
103 The data points that are used to compute the profile can be restricted
104 with options -b, -e, and -dt. Adjust -b to ensure sufficient equili‐
105 bration in each umbrella window.
106
107 With -log (default) the profile is written in energy units, otherwise
108 (with -nolog) as probability. The unit can be specified with -unit.
109 With energy output, the energy in the first bin is defined to be zero.
110 If you want the free energy at a different position to be zero, set
111 -zprof0 (useful with bootstrapping, see below).
112
113 For cyclic or periodic reaction coordinates (dihedral angle, channel
114 PMF without osmotic gradient), the option -cycl is useful. gmx wham
115 will make use of the periodicity of the system and generate a periodic
116 PMF. The first and the last bin of the reaction coordinate will assumed
117 be be neighbors.
118
119 Option -sym symmetrizes the profile around z=0 before output, which may
120 be useful for, e.g. membranes.
121
122 Parallelization
123 If available, the number of OpenMP threads used by gmx wham can be con‐
124 trolled by setting the OMP_NUM_THREADS environment variable.
125
126 Autocorrelations
127 With -ac, gmx wham estimates the integrated autocorrelation time (IACT)
128 tau for each umbrella window and weights the respective window with
129 1/[1+2*tau/dt]. The IACTs are written to the file defined with -oiact.
130 In verbose mode, all autocorrelation functions (ACFs) are written to
131 hist_autocorr.xvg. Because the IACTs can be severely underestimated in
132 case of limited sampling, option -acsig allows one to smooth the IACTs
133 along the reaction coordinate with a Gaussian (sigma provided with
134 -acsig, see output in iact.xvg). Note that the IACTs are estimated by
135 simple integration of the ACFs while the ACFs are larger 0.05. If you
136 prefer to compute the IACTs by a more sophisticated (but possibly less
137 robust) method such as fitting to a double exponential, you can compute
138 the IACTs with gmx analyze and provide them to gmx wham with the file
139 iact-in.dat (option -iiact), which should contain one line per input
140 file (.pdo or pullx/f file) and one column per pull coordinate in the
141 respective file.
142
143 Error analysis
144 Statistical errors may be estimated with bootstrap analysis. Use it
145 with care, otherwise the statistical error may be substantially under‐
146 estimated. More background and examples for the bootstrap technique
147 can be found in Hub, de Groot and Van der Spoel, JCTC (2010) 6:
148 3713-3720. -nBootstrap defines the number of bootstraps (use, e.g.,
149 100). Four bootstrapping methods are supported and selected with
150 -bs-method.
151
152 · b-hist Default: complete histograms are considered as independent
153 data points, and the bootstrap is carried out by assigning random
154 weights to the histograms (“Bayesian bootstrap”). Note that each
155 point along the reaction coordinate must be covered by multiple inde‐
156 pendent histograms (e.g. 10 histograms), otherwise the statistical
157 error is underestimated.
158
159 · hist Complete histograms are considered as independent data
160 points. For each bootstrap, N histograms are randomly chosen from
161 the N given histograms (allowing duplication, i.e. sampling with
162 replacement). To avoid gaps without data along the reaction coordi‐
163 nate blocks of histograms (-histbs-block) may be defined. In that
164 case, the given histograms are divided into blocks and only his‐
165 tograms within each block are mixed. Note that the histograms within
166 each block must be representative for all possible histograms, other‐
167 wise the statistical error is underestimated.
168
169 · traj The given histograms are used to generate new random trajecto‐
170 ries, such that the generated data points are distributed according
171 the given histograms and properly autocorrelated. The autocorrelation
172 time (ACT) for each window must be known, so use -ac or provide the
173 ACT with -iiact. If the ACT of all windows are identical (and known),
174 you can also provide them with -bs-tau. Note that this method may
175 severely underestimate the error in case of limited sampling, that is
176 if individual histograms do not represent the complete phase space at
177 the respective positions.
178
179 · traj-gauss The same as method traj, but the trajectories are not
180 bootstrapped from the umbrella histograms but from Gaussians with the
181 average and width of the umbrella histograms. That method yields sim‐
182 ilar error estimates like method traj.
183
184 Bootstrapping output:
185
186 · -bsres Average profile and standard deviations
187
188 · -bsprof All bootstrapping profiles
189
190 With -vbs (verbose bootstrapping), the histograms of each bootstrap are
191 written, and, with bootstrap method traj, the cumulative distribution
192 functions of the histograms.
193
195 Options to specify input files:
196
197 -ix [<.dat>] (pullx-files.dat) (Optional)
198 Generic data file
199
200 -if [<.dat>] (pullf-files.dat) (Optional)
201 Generic data file
202
203 -it [<.dat>] (tpr-files.dat) (Optional)
204 Generic data file
205
206 -ip [<.dat>] (pdo-files.dat) (Optional)
207 Generic data file
208
209 -is [<.dat>] (coordsel.dat) (Optional)
210 Generic data file
211
212 -iiact [<.dat>] (iact-in.dat) (Optional)
213 Generic data file
214
215 -tab [<.dat>] (umb-pot.dat) (Optional)
216 Generic data file
217
218 Options to specify output files:
219
220 -o [<.xvg>] (profile.xvg)
221 xvgr/xmgr file
222
223 -hist [<.xvg>] (histo.xvg)
224 xvgr/xmgr file
225
226 -oiact [<.xvg>] (iact.xvg) (Optional)
227 xvgr/xmgr file
228
229 -bsres [<.xvg>] (bsResult.xvg) (Optional)
230 xvgr/xmgr file
231
232 -bsprof [<.xvg>] (bsProfs.xvg) (Optional)
233 xvgr/xmgr file
234
235 Other options:
236
237 -xvg <enum> (xmgrace)
238 xvg plot formatting: xmgrace, xmgr, none
239
240 -min <real> (0)
241 Minimum coordinate in profile
242
243 -max <real> (0)
244 Maximum coordinate in profile
245
246 -[no]auto (yes)
247 Determine min and max automatically
248
249 -bins <int> (200)
250 Number of bins in profile
251
252 -temp <real> (298)
253 Temperature
254
255 -tol <real> (1e-06)
256 Tolerance
257
258 -[no]v (no)
259 Verbose mode
260
261 -b <real> (50)
262 First time to analyse (ps)
263
264 -e <real> (1e+20)
265 Last time to analyse (ps)
266
267 -dt <real> (0)
268 Analyse only every dt ps
269
270 -[no]histonly (no)
271 Write histograms and exit
272
273 -[no]boundsonly (no)
274 Determine min and max and exit (with -auto)
275
276 -[no]log (yes)
277 Calculate the log of the profile before printing
278
279 -unit <enum> (kJ)
280 Energy unit in case of log output: kJ, kCal, kT
281
282 -zprof0 <real> (0)
283 Define profile to 0.0 at this position (with -log)
284
285 -[no]cycl (no)
286 Create cyclic/periodic profile. Assumes min and max are the same
287 point.
288
289 -[no]sym (no)
290 Symmetrize profile around z=0
291
292 -[no]ac (no)
293 Calculate integrated autocorrelation times and use in wham
294
295 -acsig <real> (0)
296 Smooth autocorrelation times along reaction coordinate with
297 Gaussian of this sigma
298
299 -ac-trestart <real> (1)
300 When computing autocorrelation functions, restart computing
301 every .. (ps)
302
303 -nBootstrap <int> (0)
304 nr of bootstraps to estimate statistical uncertainty (e.g., 200)
305
306 -bs-method <enum> (b-hist)
307 Bootstrap method: b-hist, hist, traj, traj-gauss
308
309 -bs-tau <real> (0)
310 Autocorrelation time (ACT) assumed for all histograms. Use
311 option -ac if ACT is unknown.
312
313 -bs-seed <int> (-1)
314 Seed for bootstrapping. (-1 = use time)
315
316 -histbs-block <int> (8)
317 When mixing histograms only mix within blocks of -histbs-block.
318
319 -[no]vbs (no)
320 Verbose bootstrapping. Print the CDFs and a histogram file for
321 each bootstrap.
322
324 gmx(1)
325
326 More information about GROMACS is available at <‐
327 http://www.gromacs.org/>.
328
330 2020, GROMACS development team
331
332
333
334
3352019.6 Feb 28, 2020 GMX-WHAM(1)