g_tune_pme(1)

g_tune_pme(1)             GROMACS suite, VERSION 4.5             g_tune_pme(1)

NAME

       g_tune_pme - time mdrun as a function of PME nodes to optimize settings

       VERSION 4.5

SYNOPSIS

       g_tune_pme -p perf.out -err errors.log -so tuned.tpr -s topol.tpr -o
       traj.trr -x traj.xtc -cpi state.cpt -cpo state.cpt -c confout.gro -e
       ener.edr -g md.log -dhdl dhdl.xvg -field field.xvg -table table.xvg
       -tablep tablep.xvg -tableb table.xvg -rerun rerun.xtc -tpi tpi.xvg
       -tpid tpidist.xvg -ei sam.edi -eo sam.edo -j wham.gct -jo bam.gct
       -ffout gct.xvg -devout deviatie.xvg -runav runaver.xvg -px pullx.xvg
       -pf pullf.xvg -mtx nm.mtx -dn dipole.ndx -bo bench.trr -bx bench.xtc
       -bcpo bench.cpt -bc bench.gro -be bench.edr -bg bench.log -beo
       bench.edo -bdhdl benchdhdl.xvg -bfield benchfld.xvg -btpi benchtpi.xvg
       -btpid benchtpid.xvg -bjo bench.gct -bffout benchgct.xvg -bdevout
       benchdev.xvg -brunav benchrnav.xvg -bpx benchpx.xvg -bpf benchpf.xvg
       -bmtx benchn.mtx -bdn bench.ndx -[no]h -[no]version -nice int -xvg enum
       -np int -npstring enum -nt int -r int -max real -min real -npme enum
       -upfac real -downfac real -ntpr int -four real -steps step -resetstep
       int -simsteps step -[no]launch -deffnm string -ddorder enum
       -[no]ddcheck -rdd real -rcon real -dlb enum -dds real -gcom int -[no]v
       -[no]compact -[no]seppot -pforce real -[no]reprod -cpt real -[no]cpnum
       -[no]append -maxh real -multi int -replex int -reseed int -[no]ionize

DESCRIPTION

       For a given number -np or -nt of processors/threads, this program
       systematically times mdrun with various numbers of PME-only nodes and
       determines which setting is fastest. It will also test whether
       performance can be enhanced by shifting load from the reciprocal to the
       real space part of the Ewald sum. Simply pass your .tpr file to
       g_tune_pme together with other options for mdrun as needed.

       The executables to use can be set via the environment variables MPIRUN
       and MDRUN. If these are not present, 'mpirun' and 'mdrun' will be used
       as defaults. Note that for certain MPI frameworks you need to provide a
       machine- or hostfile. This can also be passed via the MPIRUN variable,
       e.g. 'export MPIRUN="/usr/local/mpirun -machinefile hosts"'.

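       The fallback described above can be mimicked in a shell script, for
       instance to record which launcher and engine a tuning run would pick
       up. This is only a sketch: the function resolve_tools and the
       variables launcher and engine are illustrative names, and the lookup
       inside g_tune_pme itself may differ in detail.

```shell
# Resolve the executables as the text describes: use $MPIRUN and $MDRUN
# when they are set and non-empty, otherwise fall back to the defaults.
# resolve_tools, launcher and engine are hypothetical names for this sketch.
resolve_tools() {
    launcher="${MPIRUN:-mpirun}"
    engine="${MDRUN:-mdrun}"
}

resolve_tools
echo "MPI launcher: $launcher"
echo "MD engine:    $engine"
```
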
       Please call g_tune_pme with the normal options you would pass to mdrun
       and add -np for the number of processors to perform the tests on, or
       -nt for the number of threads. You can also add -r to repeat each test
       several times to get better statistics.

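       As an illustration of the options just described, the following
       sketch assembles a command line for either MPI processes (-np) or
       threads (-nt), with a repeat count passed to -r. The helper tune_cmd
       is hypothetical and only prints the command; it does not run
       g_tune_pme.

```shell
# Print a g_tune_pme command line for the requested mode.
# Usage: tune_cmd <mpi|threads> <count> <repeats> <tpr file>
# tune_cmd is a made-up helper for this example, not part of GROMACS.
tune_cmd() {
    mode=$1; count=$2; repeats=$3; tpr=$4
    case "$mode" in
        mpi)     flag="-np" ;;   # number of processors
        threads) flag="-nt" ;;   # number of threads
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "g_tune_pme $flag $count -r $repeats -s $tpr"
}

tune_cmd mpi 64 3 protein.tpr
tune_cmd threads 8 3 protein.tpr
```
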
       g_tune_pme can test various real space / reciprocal space workloads for
       you. With -ntpr you control how many extra .tpr files will be written
       with enlarged cutoffs and smaller fourier grids respectively.
       Typically, the first test (no. 0) will be with the settings from the
       input .tpr file; the last test (no. ntpr) will have cutoffs multiplied
       by (and at the same time fourier grid dimensions divided by) the
       scaling factor -upfac (default 1.2). The remaining .tpr files will have
       equally spaced values in between these extremes. Note that you can set
       -ntpr to 1 if you just want to find the optimal number of PME-only
       nodes; in that case your input .tpr file will remain unchanged.

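       The equal spacing of scaling factors can be worked through with a
       short calculation. The sketch below assumes that n files are numbered
       0 to n-1 and that the factors run linearly from 1.0 to the upper
       limit; the input cutoff of 1.0 nm and the grid dimension of 48 are
       made-up values, and the function scaling_table is hypothetical. The
       grid column is the plain quotient, whereas the real program has to
       choose an integer grid size.

```shell
# Print one line per .tpr file: index, scaling factor, scaled cutoff (nm)
# and the grid dimension divided by the same factor.
# Usage: scaling_table <n files> <upper factor> <cutoff in nm> <grid points>
scaling_table() {
    awk -v n="$1" -v upfac="$2" -v rc="$3" -v grid="$4" 'BEGIN {
        for (i = 0; i < n; i++) {
            # Equal spacing from 1.0 to upfac; a single file keeps factor 1.
            f = (n > 1) ? 1 + i * (upfac - 1) / (n - 1) : 1
            printf "%d %.3f %.3f %.1f\n", i, f, rc * f, grid / f
        }
    }'
}

# Assumed example input: 4 files, upper factor 1.2, 1.0 nm cutoff, 48 points.
scaling_table 4 1.2 1.0 48
```

       With these inputs the first line keeps the input settings (factor
       1.000) and the last line shows a cutoff of 1.200 nm with the grid
       dimension reduced from 48 to 40.
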
       For the benchmark runs, the default of 1000 time steps should suffice
       for most MD systems. The dynamic load balancing needs about 100 time
       steps to adapt to local load imbalances, therefore the time step
       counters are by default reset after 100 steps. For large systems (1M
       atoms) you may have to set -resetstep to a higher value. From the 'DD'
       load imbalance entries in the md.log output file you can tell after how
       many steps the load is sufficiently balanced.

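       One way to look at the 'DD' entries mentioned above is to filter the
       log with grep. This is a minimal sketch: the function dd_entries is
       hypothetical, the pattern is deliberately loose because the exact
       layout of the entries depends on the mdrun version, and judging when
       the load is balanced is left to the reader.

```shell
# Print every line of an mdrun log that contains 'DD', so the load
# imbalance entries can be inspected by eye.
# Usage: dd_entries [logfile]   (defaults to md.log)
dd_entries() {
    grep 'DD' "${1:-md.log}"
}

# Only run when a log file is actually present in the current directory.
if [ -f md.log ]; then
    dd_entries md.log
fi
```
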
       Example call: g_tune_pme -np 64 -s protein.tpr -launch

       After calling mdrun several times, detailed performance information is
       available in the output file perf.out. Note that during the benchmarks
       a couple of temporary files are written (options -b*); these will be
       automatically deleted after each test.

       If you want the simulation to be started automatically with the
       optimized parameters, use the command line option -launch.

FILES

       -p perf.out Output
        Generic output file

       -err errors.log Output
        Log file

       -so tuned.tpr Output
        Run input file: tpr tpb tpa

       -s topol.tpr Input
        Run input file: tpr tpb tpa

       -o traj.trr Output
        Full precision trajectory: trr trj cpt

       -x traj.xtc Output, Opt.
        Compressed trajectory (portable xdr format)

       -cpi state.cpt Input, Opt.
        Checkpoint file

       -cpo state.cpt Output, Opt.
        Checkpoint file

       -c confout.gro Output
        Structure file: gro g96 pdb etc.

       -e ener.edr Output
        Energy file

       -g md.log Output
        Log file

       -dhdl dhdl.xvg Output, Opt.
        xvgr/xmgr file

       -field field.xvg Output, Opt.
        xvgr/xmgr file

       -table table.xvg Input, Opt.
        xvgr/xmgr file

       -tablep tablep.xvg Input, Opt.
        xvgr/xmgr file

       -tableb table.xvg Input, Opt.
        xvgr/xmgr file

       -rerun rerun.xtc Input, Opt.
        Trajectory: xtc trr trj gro g96 pdb cpt

       -tpi tpi.xvg Output, Opt.
        xvgr/xmgr file

       -tpid tpidist.xvg Output, Opt.
        xvgr/xmgr file

       -ei sam.edi Input, Opt.
        ED sampling input

       -eo sam.edo Output, Opt.
        ED sampling output

       -j wham.gct Input, Opt.
        General coupling stuff

       -jo bam.gct Output, Opt.
        General coupling stuff

       -ffout gct.xvg Output, Opt.
        xvgr/xmgr file

       -devout deviatie.xvg Output, Opt.
        xvgr/xmgr file

       -runav runaver.xvg Output, Opt.
        xvgr/xmgr file

       -px pullx.xvg Output, Opt.
        xvgr/xmgr file

       -pf pullf.xvg Output, Opt.
        xvgr/xmgr file

       -mtx nm.mtx Output, Opt.
        Hessian matrix

       -dn dipole.ndx Output, Opt.
        Index file

       -bo bench.trr Output
        Full precision trajectory: trr trj cpt

       -bx bench.xtc Output
        Compressed trajectory (portable xdr format)

       -bcpo bench.cpt Output
        Checkpoint file

       -bc bench.gro Output
        Structure file: gro g96 pdb etc.

       -be bench.edr Output
        Energy file

       -bg bench.log Output
        Log file

       -beo bench.edo Output, Opt.
        ED sampling output

       -bdhdl benchdhdl.xvg Output, Opt.
        xvgr/xmgr file

       -bfield benchfld.xvg Output, Opt.
        xvgr/xmgr file

       -btpi benchtpi.xvg Output, Opt.
        xvgr/xmgr file

       -btpid benchtpid.xvg Output, Opt.
        xvgr/xmgr file

       -bjo bench.gct Output, Opt.
        General coupling stuff

       -bffout benchgct.xvg Output, Opt.
        xvgr/xmgr file

       -bdevout benchdev.xvg Output, Opt.
        xvgr/xmgr file

       -brunav benchrnav.xvg Output, Opt.
        xvgr/xmgr file

       -bpx benchpx.xvg Output, Opt.
        xvgr/xmgr file

       -bpf benchpf.xvg Output, Opt.
        xvgr/xmgr file

       -bmtx benchn.mtx Output, Opt.
        Hessian matrix

       -bdn bench.ndx Output, Opt.
        Index file

OTHER OPTIONS

       -[no]h no
        Print help info and quit

       -[no]version no
        Print version info and quit

       -nice int 0
        Set the nicelevel

       -xvg enum xmgrace
        xvg plot formatting: xmgrace, xmgr or none

       -np int 1
        Number of nodes to run the tests on (must be > 2 for separate PME
       nodes)

       -npstring enum -np
        Specify the number of processors to $MPIRUN using this string: -np,
       -n or none

       -nt int 1
        Number of threads to run the tests on (turns MPI & mpirun off)

       -r int 2
        Repeat each test this often

       -max real 0.5
        Max fraction of PME nodes to test with

       -min real 0.25
        Min fraction of PME nodes to test with

       -npme enum auto
        Benchmark all possible values for -npme or just the subset that is
       expected to perform well: auto, all or subset

       -upfac real 1.2
        Upper limit for rcoulomb scaling factor (Note that rcoulomb upscaling
       results in fourier grid downscaling)

       -downfac real 1
        Lower limit for rcoulomb scaling factor

       -ntpr int 0
        Number of tpr files to benchmark. Create these many files with scaling
       factors ranging from 1.0 to fac. If < 1, automatically choose the
       number of tpr files to test

       -four real 0
        Use this fourierspacing value instead of the grid found in the tpr
       input file. (Spacing applies to a scaling factor of 1.0 if multiple tpr
       files are written)

       -steps step 1000
        Take timings for these many steps in the benchmark runs

       -resetstep int 100
        Let dlb equilibrate these many steps before timings are taken (reset
       cycle counters after these many steps)

       -simsteps step -1
        If non-negative, perform these many steps in the real run (overwrite
       nsteps from tpr, add cpt steps)

       -[no]launch no
        Launch the real simulation after optimization

       -deffnm string
        Set the default filename for all file options at launch time

       -ddorder enum interleave
        DD node order: interleave, pp_pme or cartesian

       -[no]ddcheck yes
        Check for all bonded interactions with DD

       -rdd real 0
        The maximum distance for bonded interactions with DD (nm), 0 is
       determine from initial coordinates

       -rcon real 0
        Maximum distance for P-LINCS (nm), 0 is estimate

       -dlb enum auto
        Dynamic load balancing (with DD): auto, no or yes

       -dds real 0.8
        Minimum allowed dlb scaling of the DD cell size

       -gcom int -1
        Global communication frequency

       -[no]v no
        Be loud and noisy

       -[no]compact yes
        Write a compact log file

       -[no]seppot no
        Write separate V and dVdl terms for each interaction type and node to
       the log file(s)

       -pforce real -1
        Print all forces larger than this (kJ/mol nm)

       -[no]reprod no
        Try to avoid optimizations that affect binary reproducibility

       -cpt real 15
        Checkpoint interval (minutes)

       -[no]cpnum no
        Keep and number checkpoint files

       -[no]append yes
        Append to previous output files when continuing from checkpoint
       instead of adding the simulation part number to all file names (for
       launch only)

       -maxh real -1
        Terminate after 0.99 times this time (hours)

       -multi int 0
        Do multiple simulations in parallel

       -replex int 0
        Attempt replica exchange every # steps

       -reseed int -1
        Seed for replica exchange, -1 is generate a seed

       -[no]ionize no
        Do a simulation including the effect of an X-Ray bombardment on your
       system
